
Issue: device remains "in use" if not closed on exit, until all connected digilent devices are closed


Guest

Question

Hi,

We have a measurement setup with multiple Analog Discovery 2 units hooked up to a single Linux computer.

The AD2 units are used by separate processes.

If multiple processes have opened AD2 devices, and one of them terminates without explicitly calling 'FDwfDeviceClose' (e.g. due to a hard KILL signal), that device remains unavailable, i.e., FDwfEnumDeviceIsOpened() returns 1 in its pfIsUsed parameter, even though the device is not in use anymore.

The only way I found to make such a device available for use again is to terminate all other programs that control an AD2 device. As soon as the last process controlling an AD2 terminates, all devices revert to 'available' again. (In this case, it doesn't seem to matter if the processes terminate with or without a proper call to 'FDwfDeviceClose'.)

I verified this behavior with 2 AD2s and also 2 Digital Discovery devices hooked up to a single PC. The Digital Discoveries show the same behavior.

This behavior makes it quite a bit more difficult to use multiple devices on a single computer as we do.

Two questions:

* Where is the 'is in use' status stored? Is it perhaps a flag that is located in the device itself? Or is there somehow stored state in the controlling PC? This could help me understand the issue.

* Is this a fixable issue in the library? Or is it inherent in the way the 'in use' flag is implemented?

Kind regards,

  Sidney


19 answers to this question


  • 0

Hi @reddish

The device info is cached by Adept Runtime in shared memory. This is required because some info may be unavailable when a device is opened by another process.
In case an app which uses a device crashes or is killed, the busy/opened flag will not be cleared. The clear is also performed in normal detach.
When the last app using Adept Runtime closes, the runtime is unloaded from memory together with the info cache.
However, a stuck flag does not prevent an app from connecting to a device.
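To make the stuck-flag behavior concrete, here is a minimal sketch of the general pattern, not of Adept's actual implementation. It assumes a POSIX system; the one-byte anonymous mapping is only a stand-in for the runtime's shared-memory cache, and the function name is made up for illustration.

```python
import mmap
import os


def stuck_flag_after_hard_exit():
    """Return the shared 'opened' flag after its owner exits without clearing it."""
    # Anonymous shared mapping: a stand-in for the runtime's shared-memory cache.
    cache = mmap.mmap(-1, 1)
    pid = os.fork()
    if pid == 0:
        cache[0] = 1   # child "opens" the device and sets the flag
        os._exit(0)    # then dies without the matching "close"
    os.waitpid(pid, 0)
    flag = cache[0]    # another process sharing the cache still reads 1
    cache.close()
    return flag


if __name__ == "__main__":
    print("flag after owner died:", stuck_flag_after_hard_exit())
```

Nothing in the operating system resets the byte when the child dies, so the flag stays set for as long as the mapping exists. That matches the description above, where the cache only goes away once the last user of the runtime exits.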


  • 0

Hi Attila,

 

Thanks for the info. Nice to understand what's going on behind the curtain with the shared memory, it helps me to understand the issue.

> However, a stuck flag does not prevent an app from connecting to a device.

Are you sure? My test suggests that it does prevent re-opening by another process.

I have four devices open, and kill one; it is then still listed as "in use".

When I try to open that same instrument after the kill, I get:

error (1): 'The device is being used by another application.\nDevice programming failed.'

 

Best, Sidney


  • 0

Hi @reddish

I just tested it under Ubuntu 12.4 amd64.
I can connect to the device even after the other connected process was killed and the busy flag is stuck.

Which Adept Runtime version are you using?

I have tested it with 2.20.1 and 2.20.2.

$ dpkg -l digilent.adept.runtime
||/ Name           Version        Description
+++-==============-==============-============================================
ii  digilent.adept 2.20.1         Digilent Adept Runtime

 


  • 0

On the two machines I tested, I have been using version 2.19.2 of the adept runtime.

I'll be back in the lab early next week, upgrade the package, and re-run the test.


  • 0

Hi Attila,

Thanks for trying. On my side, I tried with Adept Runtime 2.20.2, and I still see the same behavior.

To be able to properly debug, I wrote a C program that can be used to test the behavior, see here: http://delft.jigsaw.nl/~sidney/dwf_openflag_issue.tgz.

On my Linux system, with 2 Analog Discovery 2 devices connected, I get this:

$ ./dwf_openflag_issue list
SN:210321AD644B --> 0
SN:210321AA214B --> 0
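For readers without access to the C test program, a listing like the one above could be produced along these lines. This is a hedged sketch, not the linked program: `list_devices` is an illustrative name, and the library handle is passed in so the function can be tried without hardware. The `FDwfEnum`, `FDwfEnumSN` and `FDwfEnumDeviceIsOpened` calls are WaveForms SDK functions; loading the library as `libdwf.so` is an assumption about a typical Linux install.

```python
import ctypes


def list_devices(dwf):
    """Return [(serial, is_used), ...] for all enumerated devices."""
    count = ctypes.c_int()
    dwf.FDwfEnum(ctypes.c_int(0), ctypes.pointer(count))   # 0 = enumfilterAll
    result = []
    for i in range(count.value):
        serial = ctypes.create_string_buffer(32)            # char szSN[32]
        used = ctypes.c_int()
        dwf.FDwfEnumSN(ctypes.c_int(i), serial)
        dwf.FDwfEnumDeviceIsOpened(ctypes.c_int(i), ctypes.pointer(used))
        result.append((serial.value.decode(), used.value))
    return result


if __name__ == "__main__":
    # Assumption: the WaveForms runtime is installed and on the loader path.
    dwf = ctypes.cdll.LoadLibrary("libdwf.so")
    for serial, used in list_devices(dwf):
        print(f"{serial} --> {used}")
```

Note that the second value is just whatever the runtime reports in `pfIsUsed`; as discussed in this thread, it can be stale.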

Next, I open two terminals and open both devices, using './dwf_openflag_issue SN:210321AD644B' and './dwf_openflag_issue SN:210321AA214B', respectively. The program opens the devices and keeps them open until the user either enters 'q' (close device, then quit) or 'qq' (don't close device, then quit).

As expected, I then get (in a third terminal):

$ ./dwf_openflag_issue  list
SN:210321AD644B --> 1
SN:210321AA214B --> 1

Now I enter 'qq' in the terminal where the program is running that has device SN:210321AD644B open:

[SN:210321AD644B] ***NOT*** closing device and quitting.
pure virtual method called
terminate called without an active exception
Aborted

So it seems that closing the program without a call to FDwfDeviceClose() makes the dwf library pretty unhappy, due to a pure virtual function call --- probably an attempt is made to execute an unimplemented destructor of a global C++ object. (This may be a bug in itself?)

As expected as per your explanation, the SN:210321AD644B device is still listed as 'in use':

$ ./dwf_openflag_issue list
SN:210321AD644B --> 1
SN:210321AA214B --> 1

Now, opening the 'SN:210321AD644B' device (which isn't really in use any more) fails like this on my system:

$ ./dwf_openflag_issue SN:210321AD644B
[SN:210321AD644B] Opening device ...
[SN:210321AD644B] Opening device FAILED with error code 0x01...
[SN:210321AD644B] Error message:

The device is being used by another application.
Device programming failed.

[SN:210321AD644B] Exiting program.

Whereas, if I understood you correctly, you predict that this should succeed.

Perhaps you can try to see if you see different behavior from what I am seeing?

 

Best,

  Sidney


  • 0

Hi @reddish

With 2 devices I can reproduce the issue you are describing.
Earlier I had tried with only one device and two applications.

@malexander

We have 2 devices with separate apps connected to them. After one app is killed, or exits without disable/close, subsequent apps can't connect to that device.
JtscInitScanChain fails with scerrInitFailed. Maybe the DPTI enabled flag remains set in FTDIFW and the JTAG enable fails?
Could we solve this problem?


  • 0

@attila

There's a lock associated with each interface that gets acquired when the enable command is executed and released when disable is executed. The lock has to be shared across processes so it doesn't go away until all applications referencing the runtime libraries are closed. The type of lock I'm using should be recoverable from a dead thread/process but perhaps that's not working correctly. I'll look into this while I'm working on the next release.
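As an illustration of what "recoverable from a dead process" means, here is a small sketch using a different technique from whatever Adept uses internally: an advisory `flock` on a file, which the kernel drops when the owning process dies. It assumes Linux or another POSIX system, and all names in it are made up for the example.

```python
import fcntl
import os
import signal
import tempfile


def try_lock(fd):
    """Try to take an exclusive advisory lock without blocking."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def lock_recovery_after_kill():
    """Return (held_while_owner_alive, recovered_after_kill)."""
    with tempfile.NamedTemporaryFile() as lock_file:
        ready_r, ready_w = os.pipe()
        pid = os.fork()
        if pid == 0:
            try:
                fd = os.open(lock_file.name, os.O_RDWR)  # child's own open file
                fcntl.flock(fd, fcntl.LOCK_EX)           # "enable the interface"
                os.write(ready_w, b"x")
                while True:
                    signal.pause()                       # hold the lock until killed
            finally:
                os._exit(1)
        os.close(ready_w)
        os.read(ready_r, 1)                  # wait until the child owns the lock
        os.close(ready_r)
        fd = os.open(lock_file.name, os.O_RDWR)
        held = not try_lock(fd)              # lock is busy while the owner lives
        os.kill(pid, signal.SIGKILL)         # hard kill: no cleanup code runs
        os.waitpid(pid, 0)
        recovered = try_lock(fd)             # kernel released the dead owner's lock
        os.close(fd)
        return held, recovered


if __name__ == "__main__":
    print(lock_recovery_after_kill())
```

The point of the sketch is the contrast with a plain flag in shared memory: here the lock's lifetime is tied to the process by the kernel, so no explicit release call is needed after a crash.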

Thanks,
Michael


  • 0

Hi,

I just tried with adept 2.26.1 and waveforms 3.18.1, and the issue is still present.

Is there an intent to handle this issue? In an environment where multiple AD2's are used on the same machine this is an annoying bug.

Kind regards!


  • 0

I would like to fix this issue but got busy with other things and haven't taken the time to dig into it. I quickly glanced at the code again after re-reading the problem description, and I suspect that this same issue may also be present in Windows. It's not clear to me that there's a way to address this inside of the DLL without re-writing a significant amount of code. I will look at this some more next week and try to remember why there is a single shared table that contains the conflict masks for all devices, rather than separate shared memory segments for each individual device.


  • 0

Hi @malexander

I reported this bug well over 2 years ago; I really think it deserves a bit of priority at this point. It is quite a nuisance in setups using several AD2's.

I'd appreciate if you can provide an unambiguous statement on when this will be fixed, or even that you won't be fixing it at all. Although not fixing clear bugs would not be a good look, IMHO.

Cheers Sidney


  • 0


The expectation from the perspective of the Adept Runtime DLLs was always that the application using the APIs would call the appropriate functions to release resources once it is done with them. The DLLs/shared libraries use process attach and process detach (and GCC equivalent on other platforms) to perform some initialization and cleanup but they don't register signal handlers since that's typically done at the application level, and not inside of a library. An application can register the appropriate signal handlers and call the disable and close functions, which does avoid this problem. We should be able to do that in Waveforms and so should 3rd party applications using the SDKs, at least those written in C/C++.
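A minimal sketch of the application-level approach described above, assuming a POSIX system. The handler body only writes a marker; in a real application it would call the SDK's disable/close functions instead. `cleanup_output` and `on_term` are illustrative names.

```python
import os
import signal


def cleanup_output(sig):
    """Signal a child that 'holds a device'; return what its handler wrote."""
    out_r, out_w = os.pipe()
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            def on_term(signum, frame):
                os.write(out_w, b"device closed")  # stand-in for disable + close
                os._exit(0)
            signal.signal(signal.SIGTERM, on_term)
            os.write(ready_w, b"x")                # tell the parent we are ready
            while True:
                signal.pause()
        finally:
            os._exit(1)
    os.close(out_w)
    os.close(ready_w)
    os.read(ready_r, 1)        # wait until the handler is installed
    os.kill(pid, sig)
    os.waitpid(pid, 0)
    result = os.read(out_r, 64)
    os.close(out_r)
    os.close(ready_r)
    return result


if __name__ == "__main__":
    print(cleanup_output(signal.SIGTERM))   # handler runs
    print(cleanup_output(signal.SIGKILL))   # handler cannot run
```

With SIGTERM the handler gets a chance to release the device. SIGKILL cannot be caught, so in that case nothing is written and any shared state the process set stays as it was.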

I was able to replicate the issue you are describing in Windows by connecting two Analog Discovery 2 devices, opening each one in Waveforms, and then using task manager to kill one instance of Waveforms. 

Since multiple processes need to be able to access the device there is a variable in shared memory (protected by mutex) that keeps track of which interfaces are enabled, and is used to check for a conflict when a thread/process attempts to enable an interface on a handle. When Adept receives a request to enable a port it checks for a conflict, and if there is no conflict it will set a bit indicating which resource was enabled and allow the rest of the port initialization code to execute. When an application is done with an interface it's expected to call the associated disable function and then close the interface handle. When Adept processes such a request it clears the bit in the variable that keeps track of the in use resource for the device. The variable in the shared memory segment is only initialized the first time the DLL is loaded into RAM, and in the scenario where this issue occurs the DLL remains loaded into RAM because there is still an application accessing the other device.
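The conflict check described above can be modeled in a few lines. This is a toy model only: the class, the bit values and the serial strings are invented for illustration and do not reflect Adept's real data layout.

```python
JTAG = 0x1  # illustrative bit values, not Adept's real encoding
DPTI = 0x2


class ConflictTable:
    """Toy model of one shared table holding an 'enabled' bitmask per device."""

    def __init__(self):
        self.masks = {}  # serial -> bitmask of enabled interfaces

    def enable(self, serial, interface):
        mask = self.masks.get(serial, 0)
        if mask & interface:
            return False                      # conflict: bit already set
        self.masks[serial] = mask | interface
        return True

    def disable(self, serial, interface):
        self.masks[serial] = self.masks.get(serial, 0) & ~interface


table = ConflictTable()
first = table.enable("SN:A", JTAG)    # process 1 enables JTAG on device A
# Process 1 is killed here, so disable() is never called for it.
second = table.enable("SN:A", JTAG)   # process 2 hits the stale bit
other = table.enable("SN:B", JTAG)    # device B is unaffected
print(first, second, other)
```

The stale bit only disappears when the whole table is re-created, which in the real runtime corresponds to the last process unloading the library.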

In actuality there are two locking mechanisms: the one I described above, which is the original locking mechanism, and a secondary locking mechanism that was later added for applications that access the device without the Adept APIs but must share access to the hardware. The secondary locking mechanism also involves storing data structures in shared memory, but is different in that each device interface has its own mutex associated with it and when an application wants access to the interface it must lock the mutex corresponding to that interface. In this case the mutex can be recovered if the application is terminated. Adept also uses this mechanism after it performs the original conflict check so a potential workaround could involve skipping the original conflict check.

I have confirmed that removing the call that checks to see if there is a resource conflict when enabling the interface and then relying on the secondary (newer) mechanism to prevent another process from accessing an in use resource works in the Adept test suite and that the resource lock can be recovered if the owning thread was killed. However, I still see Waveforms listing the device as in use by another process and will need to discuss this with the developer. Further, there are other resources in shared memory that aren't getting cleaned up when you kill the process without calling the appropriate disable or close functions and it's not immediately obvious to me what impact that may have. I'm also hesitant to remove the original resource conflict check because at the time that the secondary locking mechanism was added (late 2013) there was at least one 3rd party developer who (against my recommendation) was distributing the shared libraries alongside their executable so that they could load a specific version of the Runtime libraries, instead of using the same version system wide, which is how the system was designed to be used. There's no way for me to know how many 3rd party developers might be including old versions of Adept Runtime DLLs alongside their application and loading those instead of the version of the DLLs that are in the system location so there's some risk that making this change would cause unexpected problems for other developers.

You asked about timeline. If we were to make this change, and if it only involved making the changes I already tried, then the majority of the time would be spent on testing with different hardware and software combinations across multiple platforms. I imagine that might take a day or two, depending on how thorough the testing is. However, that doesn't mean this will get done immediately because it involves diverting resources from other projects. We will discuss this internally to determine the full impact of the change and a timeline for when it will be implemented if we do make the change.


  • 0

Hi @malexander

> The expectation from the perspective of the Adept Runtime DLLs was always that the application using the APIs would call the appropriate functions to release resources once it is done with them.

That is not an assumption that a robust library can make, I'm afraid.

Adding exit handlers would not be the right way to address this issue, because it is possible for a process to terminate in a way that precludes exit handlers from executing, e.g. by calling the POSIX _exit() function or being killed by the OS; and then you're back to square one. A proper solution would be robust to that.
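The `_exit()` case is easy to demonstrate. The sketch below uses Python's `atexit` as a stand-in for a library's exit handler; the `SCRIPT` text and `exit_output` are illustrative names, and the printed marker stands in for a real close call.

```python
import subprocess
import sys

# Child script: registers an exit handler, then exits normally or via os._exit().
SCRIPT = """
import atexit, os, sys
atexit.register(lambda: print("device closed"))  # stand-in for a close call
if sys.argv[1] == "hard":
    os._exit(0)  # like POSIX _exit(): skips exit handlers
"""


def exit_output(mode):
    """Run the child script and return whatever its exit handler printed."""
    proc = subprocess.run(
        [sys.executable, "-c", SCRIPT, mode],
        capture_output=True,
        text=True,
    )
    return proc.stdout.strip()


if __name__ == "__main__":
    print(repr(exit_output("normal")))
    print(repr(exit_output("hard")))
```

On a normal exit the handler runs; with `os._exit()` it is skipped, which is why cleanup that depends on the exiting process alone cannot cover every termination path.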

Please share the timeline (or the decision not to fix it) after your internal discussion. This has been in limbo for too long.

Best, Sidney


  • 0

Hi @reddish

I've implemented a workaround in the Adept Runtime to address the issue you are experiencing with the Analog Discovery 2 and Digital Discovery. I tested the workaround with the desktop version of Waveforms and found that after hard killing a process that's connected to a device you can launch another copy of Waveforms and reconnect to the device. Waveforms still says that the device is in use because it looks at the open handle reference count, which is in shared memory and not updated when a process is hard killed, but you can now connect to the device again without the need to close all other processes referencing the Adept Runtime. This should also work with Waveforms SDK.
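With the workaround as described, an SDK application would presumably attempt the open regardless of what the enumeration reports. A hedged sketch of that idea follows; `open_device` and `open_even_if_flagged` are illustrative names, the library handle is passed in as in a typical ctypes setup, and `FDwfEnum` is assumed to have been called beforehand so that the index is valid.

```python
import ctypes


def open_device(dwf, index):
    """Open device `index`; return the handle or raise with the library's message."""
    hdwf = ctypes.c_int()
    dwf.FDwfDeviceOpen(ctypes.c_int(index), ctypes.pointer(hdwf))
    if hdwf.value == 0:                          # hdwfNone: open failed
        msg = ctypes.create_string_buffer(512)   # char szError[512]
        dwf.FDwfGetLastErrorMsg(msg)
        raise RuntimeError(msg.value.decode().strip())
    return hdwf.value


def open_even_if_flagged(dwf, index):
    """Report the (possibly stale) in-use flag, but try to open anyway."""
    used = ctypes.c_int()
    dwf.FDwfEnumDeviceIsOpened(ctypes.c_int(index), ctypes.pointer(used))
    handle = open_device(dwf, index)             # raises if really unavailable
    return handle, bool(used.value)
```

The flag is treated as informational only; whether the open succeeds is decided by the runtime, and a failure still surfaces the library's own error text.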

Please install Adept Runtime 2.27.6 using one of the packages below and let me know if this solves your issue.

Thanks,
Michael

32-bit and 64-bit x86 Windows

32-bit x86 Linux Debian Package

32-bit x86 Linux RPM Package

64-bit x86_64 Linux Debian Package

64-bit x86_64 Linux RPM Package

ARMHF Linux Debian Package

ARMHF Linux RPM Package

ARM64 Linux Debian Package

ARM64 Linux RPM Package
 


  • 0

Hi @malexander,

 

We have a similar issue with a Digilent Digital Discovery. We are using the Waveforms SDK to automate measurements of PWM signals with the Digital Discovery. The custom application (.dll), which uses the dwf.cs wrapper for dwf.lib, is written in C#.

The problem is that the Digilent device sometimes remains in the "in use" state if an error occurs (green LED is on). In that state we have lost the handle and are not able to reconnect using the FDwfDeviceOpen function. The error message says the device is already in use. But it was opened in a task that has already been killed, and therefore the handle is lost.
We tested the workaround using Adept Runtime 2.27.6 but cannot reopen the device. Is there any possibility to get the handle of a device that is already in use? The other enumeration functions are working (enumerate, get serial number, device type, etc.).

Best regards
Jörg 
