We found out that it is AA causing the issue: they ran the following script and got the same error in SQL.
I think I found the culprit SQL…
I ran a new trace with more events and columns, and was looking at the error occurrence at 14:02:34.
This has the same SPID as the errors we were looking at, and is the only other one that has a value other than null or 0 in the error column.
Also take note of the duration (in milliseconds)… about 9.7 minutes.
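If you want to slice the same trace file yourself, something like this should pull out the rows with an error set (the trace path here is a placeholder, and note that fn_trace_gettable reports Duration in microseconds even though Profiler displays milliseconds):

select StartTime, SPID, Error, Duration / 1000 as DurationMs, TextData
from sys.fn_trace_gettable(N'C:\Traces\mytrace.trc', default)
where Error is not null and Error <> 0
order by StartTime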
It’s a batch: a couple of statements to detect changes, followed by a monster SQL statement, and then two more statements to drop the resulting temp tables.
select * into #A5_ChangeTemp from CHANGETABLE(CHANGES AAG00905, 6817) A5_CT where A5_CT.SYS_CHANGE_VERSION <= 17219 AND A5_CT.SYS_CHANGE_OPERATION != 'D'
select * into #A4_ChangeTemp from CHANGETABLE(CHANGES AAG00904, 6817) A4_CT where A4_CT.SYS_CHANGE_VERSION <= 17219 AND A4_CT.SYS_CHANGE_OPERATION != 'D'
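-- (my reading: CHANGETABLE(CHANGES AAG00905, 6817) returns the rows in AAG00905 that have changed
-- since change tracking version 6817; the SYS_CHANGE_VERSION <= 17219 filter caps it at the version
-- this sync is working up to, and != 'D' skips deletes)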
Monster query here, referencing those temp tables
drop table #A5_ChangeTemp
drop table #A4_ChangeTemp
I changed the names of those temp tables and ran the first two statements manually, then edited the monster query to reference my temp tables and ran it. It took almost 10 minutes and gave the same error. It has an ORDER BY and lots of joins, so I don’t doubt it is trying to create a worktable and running up against the SQL row length limit, given how many columns are referenced in the query.
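In case it helps anyone reproduce it, the manual run looked roughly like this (the temp table names are just what I picked, and the monster query body is omitted because it’s huge):

select * into #repro_A5_ChangeTemp from CHANGETABLE(CHANGES AAG00905, 6817) A5_CT where A5_CT.SYS_CHANGE_VERSION <= 17219 AND A5_CT.SYS_CHANGE_OPERATION != 'D'
select * into #repro_A4_ChangeTemp from CHANGETABLE(CHANGES AAG00904, 6817) A4_CT where A4_CT.SYS_CHANGE_VERSION <= 17219 AND A4_CT.SYS_CHANGE_OPERATION != 'D'
-- monster query here, edited to reference #repro_A5_ChangeTemp and #repro_A4_ChangeTemp
drop table #repro_A5_ChangeTemp
drop table #repro_A4_ChangeTemp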
Also, I was seeing hints in the configuration console that this service checks for tasks to work on every minute. We were seeing gaps of about 11 minutes between the errors, so I think it tries this for 10 minutes, fails, and then creates a new task to try again a minute later. That seems to fit too.
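To sanity check the timing, something like this against the same trace file shows the gap between consecutive error rows (again, the trace path is a placeholder):

select StartTime,
       datediff(second, lag(StartTime) over (order by StartTime), StartTime) as SecondsSinceLastError
from sys.fn_trace_gettable(N'C:\Traces\mytrace.trc', default)
where Error is not null and Error <> 0
order by StartTime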
It’s a complex query with multiple layers of subqueries, so it would take some work to really pick it apart. But Keith and I looked through it briefly together, and he has some thoughts on what it is up to and on things we might try to see if we can make the error go away by reducing something.