Microsoft Dynamics AX (Archived)

Waves crash due to SPID reaching max capacity

Posted on by 796

Long time no talk AX Community (and apologies if this is showing up twice, I hit post and it disappeared).

I am wondering if anyone out there has seen the situation below. We are running AX 2012 CU11 with Advanced Warehousing enabled. Waves are set to process in batch, and parallel allocation is enabled on the allocatewave method and set to 6 (we tried different values, and six seemed to avoid item allocation issues).

Several weeks ago, users started getting the error in screenshot 1, "Error accessing database connection", when waves were released. If I ran the wave myself, it would run. In the following days the same error occurred, but now the wave would not run for me either unless I disabled parallel allocation on the method. Then it started failing even when I disabled the process-in-batch parameter in Warehouse management (it seems to process now, but only when run through a client session).

Similar errors appear throughout the process, but only on the AOS: database connection errors, communication link errors, fatal SQL login errors, and so on. There are no errors in SQL Server or in AX, only in the event logs of the batch AOS.

Our first thought was a TCP/IP port shortage. Following an article on TechNet, we set the TCP/IP timeout to 30 in the registry and restarted. That seemed to work, but four days later the error returned.

After some poking around, I discovered that when the batch job runs, SQL Server SPIDs skyrocket from under 500 to the maximum of 32,767. (I captured the screenshot below after the count had started draining; it drains very slowly, at 1-4 per second.) The sessions are in a sleeping state with a command of "awaiting command". A test server exhibits similar behavior, except that while it also generates thousands of SPIDs, it seems to reuse them, whereas PROD just keeps generating new ones.
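
To put the drain rate in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted above; treating "under 500" as exactly 500 and assuming a constant drain rate are simplifications, so read the result as a rough range rather than a measurement.

```python
MAX_SPIDS = 32_767        # peak session count reported in the post
BASELINE_SPIDS = 500      # "under 500" before the batch job; assumed to be 500 here


def drain_hours(rate_per_second: float) -> float:
    """Hours needed to fall from the peak back to the baseline at a constant drain rate."""
    sessions_to_release = MAX_SPIDS - BASELINE_SPIDS
    return sessions_to_release / rate_per_second / 3600


fastest = drain_hours(4)  # 4 sessions released per second
slowest = drain_hours(1)  # 1 session released per second
print(f"Drain time: {fastest:.1f} to {slowest:.1f} hours")
```

At these rates the server needs somewhere between roughly two and nine hours to get back to its normal session count, which would fit with the error coming back on later wave runs.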

In the meantime, I've asked a coworker to try this out in a Contoso instance to see what the SPID behavior is.
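
For comparing SPID behavior across PROD, test, and Contoso, something like the sketch below might help. The query reads the SQL Server DMV sys.dm_exec_sessions and is only held as a string here, not executed; the host names and the sample rows are made up for illustration, and how the rows get fetched (for example through an ODBC client) is left out.

```python
from collections import Counter

# T-SQL that could be run against the SQL Server instance to pull the rows.
# It is not executed in this sketch.
SESSION_QUERY = """
SELECT session_id, status, host_name, program_name
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;
"""


def tally_sessions(rows):
    """Count sessions per (host_name, status) pair, largest group first."""
    counts = Counter((row["host_name"], row["status"]) for row in rows)
    return counts.most_common()


# Made-up sample rows standing in for a real query result.
sample = [
    {"session_id": 51, "status": "sleeping", "host_name": "BATCH-AOS", "program_name": "Microsoft Dynamics AX"},
    {"session_id": 52, "status": "sleeping", "host_name": "BATCH-AOS", "program_name": "Microsoft Dynamics AX"},
    {"session_id": 53, "status": "running", "host_name": "BATCH-AOS", "program_name": "Microsoft Dynamics AX"},
    {"session_id": 54, "status": "sleeping", "host_name": "CLIENT-01", "program_name": "Microsoft Dynamics AX"},
]

for (host, status), count in tally_sessions(sample):
    print(f"{host:10} {status:10} {count}")
```

Running the tally every few seconds while a wave processes would show whether sleeping sessions from the batch AOS keep accumulating or get reused.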

Error-accessing-database-connection.jpg

UserSessions.jpg

  • Verified answer
    Guy Terry
    28,924 Moderator

    I probably don't have any answer for your problem, aside from vaguely suggesting that this 'newer than CU11' hotfix might help: KB4024685 Multithreading induces issues during Automatic release of sales orders by using the Wave Processing logic.

    However, I did want to say that 'WHSWorkCreateHistory' is the Work creation history log. Would it be too much to hope that turning off 'Create work creation history log' (in Warehouse management parameters) would solve your error?

  • Suggested answer
    Ivan (Vanya) Kashperuk

    I think Guy is right, just not sure if that KB pulls the right dependencies in.

    The excessive SPID consumption specifically is due to finalize not being called on UserConnection objects after use, and the wave processing flow creates a bunch of separate user connections.

    You can check a few places, like WHSPostEngine::createWaveExecutionHistoryLine, to see if finalize is called. Since you are on 6.3, there's no finally support yet, so it's done through some SysConnection tracker.

  • Solozmar
    796

    This one was interesting. It turned out that the work creation history log was not the root cause. We simply have far too many Location Directives and Work templates, and with the sequencing set the way it was, the number of failures was immense and spawned thousands of threads.

    This was particularly true for replenishment, where not all of our items have fixed locations yet. Those items are directed to a location profile type that contains over 100 "dynamic" locations, so there can be hundreds of failures there alone, on dozens of items.

  • lennartc
    70

    We have also discovered issues on older versions where the connection in WHSWaveStepController::createControlRecord was not finalized, which could leave the sessions open.

    The fix was to add connection.finalize() after the ttsabort and ttscommit in the customer version of WHSWaveStepController::createControlRecord.
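
The replies above trace the leak to user connections that are never finalized, on either the commit or the abort path. The following is a toy Python model of that pattern, not X++ and not the actual WHS code: FakeServer, FakeConnection, and both write_history functions are hypothetical stand-ins. It only illustrates why finalizing on both paths keeps the open-session count flat when there is no finally block to rely on.

```python
class FakeServer:
    """Stand-in for the database server; it only counts open sessions."""

    def __init__(self):
        self.open_sessions = 0


class FakeConnection:
    """Stand-in for a user connection that holds one server session until finalized."""

    def __init__(self, server):
        self.server = server
        self.finalized = False
        server.open_sessions += 1

    def finalize(self):
        # Release the session exactly once.
        if not self.finalized:
            self.finalized = True
            self.server.open_sessions -= 1


def write_history_leaky(server, fail):
    conn = FakeConnection(server)
    # The commit or abort would happen here; finalize is never called.
    return not fail


def write_history_fixed(server, fail):
    conn = FakeConnection(server)
    if fail:
        conn.finalize()  # release the session after the abort
        return False
    conn.finalize()      # release the session after the commit
    return True


def run(writer, calls=1000):
    """Call the writer repeatedly, failing every other call; return sessions left open."""
    server = FakeServer()
    for i in range(calls):
        writer(server, fail=(i % 2 == 0))
    return server.open_sessions


leaky_open = run(write_history_leaky)
fixed_open = run(write_history_fixed)
print(leaky_open, fixed_open)  # prints "1000 0"
```

In the leaky version every call leaves a session behind whether it succeeds or fails, so a wave with thousands of location directive failures would pile up thousands of sessions in this model.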
