The join table D is on two columns: D.c1 = A.c1 and D.c2 = A.c2. If I comment out one of the columns, the left join appears to work, so the issue may be related to a left join with multiple columns constrained. The two columns being specified are the first two of three columns in the primary key on table D. Removing 'c1' from the join results in the join functioning - and also makes the join non-SARGable. Removing c2 from the join still produces wrong results, but is still SARGable.
Definitely something with SARGable left join on a multi-column primary key.
This doesn't work:
Item it join Batch b on b.FileID = it.FileID and b.LineNumber = it.BatchLineNumber join [File] f on f.FileID = it.FileID left join ItemAnomaly ia on ia.SuccessorLineNumber = it.LineNumber and ia.SuccessorFileID = (it.FileID + 1) left join AnomalyType at on at.AnomalyTypeID = ia.SuccessorAnomalyTypeID
But this does:
Item it join Batch b on b.FileID = it.FileID and b.LineNumber = it.BatchLineNumber join [File] f on f.FileID = it.FileID left join ItemAnomaly ia on (ia.SuccessorLineNumber + 1) = (it.LineNumber + 1) and (ia.SuccessorFileID +1 ) = (it.FileID + 1) left join AnomalyType at on at.AnomalyTypeID = ia.SuccessorAnomalyTypeID
If I drop the primary key and add a (non-unique) index on the two columns I'm trying to join, the results are still incorrect - the left join is matching on all rows.
Changing the join order doesn't make any difference.
Removing the sort doesn't make any difference.
I have a database and query that I could upload - the database is about 650mb ZIP'd. I suspect I could repro in a much smaller database with some work, but I've got a fast internet connection, so I don't mind pushing the big file.
Ah ha! Adding foreign key constraints causes the query to run correctly. Of course, the existence of a foreign key constraint should never affect the results of a query.
Here's a complete repro script, starting from an empty database:
CREATE TABLE [AnomalyType] ( [AnomalyTypeID] Int NOT NULL, [Description] NVarChar(50) NOT NULL ); CREATE TABLE [Item] ( [FileID] Int NOT NULL, [LineNumber] Int NOT NULL, [TraceNumber] BigInt NOT NULL ); CREATE TABLE [ItemAnomaly] ( [FileID] Int NOT NULL, [LineNumber] Int NOT NULL, [AnomalyTypeID] Int NOT NULL ); ALTER TABLE [AnomalyType] ADD CONSTRAINT [PK_NewTable3] PRIMARY KEY ([AnomalyTypeID]); ALTER TABLE [Item] ADD CONSTRAINT [PK_NewTable1] PRIMARY KEY ([FileID],[LineNumber]); ALTER TABLE [ItemAnomaly] ADD CONSTRAINT [PK_NewTable2] PRIMARY KEY ([FileID],[LineNumber],[AnomalyTypeID]); insert Item (FileID, LineNumber, TraceNumber) values (1,1,12345) insert Item (FileID, LineNumber, TraceNumber) values (1,2,23456) insert Item (FileID, LineNumber, TraceNumber) values (2,1,34567) insert ItemAnomaly (FileID,LineNumber,AnomalyTypeID) values (1,2,2) insert ItemAnomaly (FileID,LineNumber,AnomalyTypeID) values (2,2,2) insert AnomalyType (AnomalyTypeID, [Description]) values (0,'Nothing') insert AnomalyType (AnomalyTypeID, [Description]) values (1,'AT 1') insert AnomalyType (AnomalyTypeID, [Description]) values (2,'AT 2') select * from Item it left join ItemAnomaly ia on ia.FileID=it.FileID and ia.LineNumber=it.LineNumber left join AnomalyType at on at.AnomalyTypeID=ia.AnomalyTypeID -- Expected Results /* 1 1 12345 1 2 23456 1 2 2 2 AT 2 2 1 34567 */ -- Actual Results /* 1 1 12345 1 2 2 2 AT 2 1 2 23456 1 2 2 2 AT 2 2 1 34567 2 2 2 2 AT 2 */ -- Adding foreign key constraints ALTER TABLE [ItemAnomaly] ADD CONSTRAINT [FK_ItemAnomaly_AnomalyType] FOREIGN KEY ([AnomalyTypeID]) REFERENCES [AnomalyType]([AnomalyTypeID]); ALTER TABLE [ItemAnomaly] ADD CONSTRAINT [FK_ItemAnomaly_Item] FOREIGN KEY ([FileID],[LineNumber]) REFERENCES [Item]([FileID],[LineNumber]); -- With the foreign key constraints in place, the results of the query are correct and fast
If you can send the example database to our TransferBigFiles drop-box (see here) we can test it against 5.7.8 and 6.0 pretty quickly--as well as check whether the fixes that we'll be releasing very soon will fix it or if it will need more specific investigation.
Ah, thanks for the small repro example. We were able to confirm the difference in behavior with that example between 5.7.8 and 6.0.4, and that will be easier to debug through than with the original large database.
It looks like this was fixed in VDB 6.0.5. Thanks!
Sorry, no - it's not fixed in 6.0.5. I fooled myself by still having the foreign key constraints in place.
Any estimate of when this will be fixed? I'm trying to get a product out the door - I can go with one of the workarounds, but I'd rather have a fix, if it's coming soon.
We're actively investigating this specific problem, and we believe we are close to identifying a fix. We'll be releasing another update once we have fixes for this and one or two other reported regressions, probably before the end of the month.
If you need a fixed version even more urgently, please open a ticket and we can give you an earlier build as soon as this specific fix is ready without waiting on fixes for other issues that we'd like to also include in the next general update.
Thanks for the update. I'll wait for the next general release - end of the month would be awesome.
We've just released 6.0 Update 2 (version 6.0.8) which fixes this issue.
Confirmed fixed! Thanks for the update.
Carl Daniel
Are there any known bugs with left joins not working correctly in VDB6?
I have a query that joins five tables: A join B join C left join D left join E
I expect maybe 1% of the rows to have matching rows from tables D and E, but the query appears to be returning an arbitrary row from table D and E rather than NULL.
The query is constrained by conditions on columns of table A.
All of the joins are SARGable
The query contains an order by clause that requires an explicit sort.
I can supply more details - schema, etc., if needed - just wanted to get this out there in case this sketch is enough.
I believe that this query worked correctly with 5.8.7 - I can verify that, but it'll take a while to rebuild the application and re-generate the database.