Long time no talk, AX Community (and apologies if this shows up twice; I hit post and it disappeared).
I am wondering if anyone out there has seen the situation below. We are running AX 2012, CU11, with Advanced Warehousing enabled. Waves are set to process in batch, and parallel allocation is enabled on the allocatewave method and set to 6 (we tried different values, and six seemed to avoid item allocation issues).
Several weeks ago, when waves were released, users started getting the error in screenshot 1: "Error accessing database connection". At first, if I ran the wave myself, it would run. In the following days the same error occurred, and the wave would no longer run for me either unless I disabled parallel allocation on the method. Then it started failing outright and wouldn't process even if I disabled the warehouse management parameter to process waves in batch (it seems to process now, but only if done through a client session).
Similar errors show up throughout the process, but only on the AOS: "database connection errors", "communication link errors", "fatal SQL login", etc. There are no errors in SQL or in AX, only in the event logs of the batch AOS.
Our first thought was a TCP/IP port shortage. Following an article on TechNet, we set the TCP/IP timeout to 30 in the registry and restarted. That seemed to work, but four days later the error returned.
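In case it helps anyone check for the same thing, here is a rough sketch of how the TCP connection states on the batch AOS could be tallied to see whether connections are piling up in TIME_WAIT. It assumes you have saved the text output of `netstat -ano` from the server; the function name and the sample output are made up for illustration and are not from our environment.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Tally TCP connection states from the text output of `netstat -ano`.

    TCP rows are expected to look like:
    Proto  Local Address  Foreign Address  State  PID
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and UDP rows (UDP has no State column).
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            counts[fields[3]] += 1
    return counts


# Hypothetical sample output, for illustration only.
sample_output = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    10.0.0.5:49152         10.0.0.9:1433          ESTABLISHED     4120
  TCP    10.0.0.5:49153         10.0.0.9:1433          TIME_WAIT       0
  TCP    10.0.0.5:49154         10.0.0.9:1433          TIME_WAIT       0
  UDP    0.0.0.0:123            *:*                                    980
"""

if __name__ == "__main__":
    for state, total in count_tcp_states(sample_output).most_common():
        print(state, total)
```

If the TIME_WAIT count stays small while the errors occur, port exhaustion is probably not the whole story.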
After some poking around, I discovered that when the batch job runs, the number of SQL Server SPIDs skyrockets from under 500 to the maximum of 32,767. (I captured it below, after it had started draining; it drains very slowly, at 1-4 SPIDs per second.) The sessions are in a sleeping status with a command of AWAITING COMMAND. A test server exhibits similar behavior, except that while it generates thousands of SPIDs it seems to reuse them, whereas PROD just keeps generating new ones.
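For anyone who wants to compare against their own environment, here is a minimal sketch of how the sessions could be grouped to see which host and program own the sleeping SPIDs. The query reads the standard `sys.dm_exec_sessions` view; the Python helper, the host names, and the sample rows are hypothetical, and I'm assuming the rows are fetched with whatever SQL client you normally use.

```python
from collections import Counter

# Query against the standard session DMV; run it with any SQL Server client.
SESSION_QUERY = """
SELECT session_id, host_name, program_name, status
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
"""


def summarize_sessions(rows):
    """Count sessions per (host_name, program_name, status).

    rows: iterable of (session_id, host_name, program_name, status)
    tuples, in the column order of SESSION_QUERY.
    Returns a list of ((host, program, status), count), largest first.
    """
    counts = Counter()
    for _session_id, host, program, status in rows:
        counts[(host, program, status)] += 1
    return counts.most_common()


# Hypothetical sample rows, for illustration only.
sample_rows = [
    (51, "BATCHAOS01", "Microsoft Dynamics AX", "sleeping"),
    (52, "BATCHAOS01", "Microsoft Dynamics AX", "sleeping"),
    (53, "BATCHAOS01", "Microsoft Dynamics AX", "running"),
    (54, "CLIENTAOS01", "Microsoft Dynamics AX", "sleeping"),
]

if __name__ == "__main__":
    for key, total in summarize_sessions(sample_rows):
        print(key, total)
```

Running the summary a few seconds apart during a wave release should show whether the count for the batch AOS keeps climbing (new sessions each time) or levels off (sessions being reused).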
In the meantime, I've asked a coworker to try this out in a Contoso instance to see what the SPID behavior is.

