When all is said and done, there are a bunch of tables that should be identical. What is the quickest way to verify that those tables (on two different servers) are in fact identical? I'm talking both schema and data.

Can I do a hash on the table itself, like I would be able to on an individual file or filegroup, to compare one to the other? We have Red-Gate data compare, but since the tables in question contain millions of rows each, I'd like something a little more performant.

One approach that intrigues me is this creative use of the union statement. But I'd like to explore the hash idea a little further if possible.

For any future visitors, here is the exact approach I ended up taking. It worked so well we're doing it on every table in each database. Thanks to the answers below for pointing me in the right direction.

    -- parameter = if no table name was passed do them all, otherwise just check the one
    -- create a temp table that lists all tables in target database
    CREATE TABLE #ChkSumTargetTables (... varchar(250), ... varchar(50), chksum int)
    ... INNER JOIN ... S ON T.schema_id = S.schema_id
    ... T.name like ...
    -- create a temp table that lists all tables in source database
    CREATE TABLE #ChkSumSourceTables (... varchar(250), ... varchar(50), chksum int)
    ... T.name like ...
    -- build a dynamic sql statement to populate temp tables with the checksums of each table
    ... + 'UPDATE #ChkSumTargetTables SET ... = (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM ' ...
    ... + 'UPDATE #ChkSumSourceTables SET ... = (SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM ' ...
    -- execute dynamic statements - populate temp tables with checksums
    EXEC ...
    -- ... the two databases to find any checksums that are different
    ... LEFT JOIN #ChkSumSourceTables ST ON TT.Name = ST.Name
    WHERE IsNull(ST.chksum,0) <> IsNull(TT.chksum,0)

Here's what I've done before:

    (SELECT 'TableA', * FROM TableA ...

It's worked well enough on tables of about 1,000,000 rows, but I'm not sure how well it would work on extremely large tables.

I've run the query on my system, comparing two tables with 21 fields of regular types in two different databases attached to the same server running SQL Server 2005. The table has about 3 million rows, and about 25,000 rows differ. The primary key on the table is weird, however, as it's a composite key of 10 fields (it's an audit table).

The execution plans for the queries have a total cost of 184.25879 for UNION and 184.22983 for UNION ALL. The tree cost only differs on the last step before returning rows, the concatenation. Actually executing either query takes about 42s, plus about 3s to transmit the rows. The time between the two queries is identical.

This is actually extremely fast, each one running against 3 million rows in about 2.5s:

    SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM TableA
    SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) FROM TableB

If the results of those don't match, you know the tables are different. However, if the results do match, you're not guaranteed that the tables are identical because of the chance of checksum collisions.

I'm not sure how datatype changes between tables would affect this calculation. I would run the query against the system views or information_schema views.

I tried the query against another table with 5 million rows and that one ran in about 5s, so it appears to be largely O(n).

If you often join tables in SQL, you've probably noticed that not all data from one table corresponds to data in another table. Sometimes you need only the data that has a match in both tables. And sometimes you also need to keep the unmatched rows. Which JOIN gets you the unmatched rows? How do you do that in a JOIN?
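The two comparison ideas discussed here (one aggregate checksum per table, and a row-level set difference taken in both directions) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python with an in-memory SQLite database as a stand-in for SQL Server, the helper names `table_checksum` and `row_differences` are hypothetical, and the XOR of per-row CRC32 values is a rough analogue of `CHECKSUM_AGG(BINARY_CHECKSUM(*))`, not SQL Server's actual algorithm. The `EXCEPT`-based query is one common way to write a two-way difference and is not necessarily the exact query used in the answer above.

```python
import sqlite3
import zlib


def table_checksum(conn, table):
    """Order-independent checksum of a table: XOR of one CRC32 per row.

    Assumption: 'table' is a trusted identifier (it is interpolated into SQL).
    """
    total = 0
    for row in conn.execute(f"SELECT * FROM {table}"):
        total ^= zlib.crc32(repr(row).encode("utf-8"))
    return total


def row_differences(conn, a, b):
    """Rows present in one table but not the other, tagged with their source."""
    sql = f"""
        SELECT '{a}' AS source, * FROM (SELECT * FROM {a} EXCEPT SELECT * FROM {b})
        UNION ALL
        SELECT '{b}' AS source, * FROM (SELECT * FROM {b} EXCEPT SELECT * FROM {a})
    """
    return conn.execute(sql).fetchall()


# Toy data: same rows in a different order, except for one changed value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER, name TEXT);
    CREATE TABLE TableB (id INTEGER, name TEXT);
    INSERT INTO TableA VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO TableB VALUES (3, 'gamma'), (1, 'alpha'), (2, 'BETA');
""")

# Cheap first pass: differing checksums prove the tables differ.
checksums_match = table_checksum(conn, "TableA") == table_checksum(conn, "TableB")

# Slower second pass: list exactly which rows differ.
differences = row_differences(conn, "TableA", "TableB")
```

As with the SQL Server functions, matching checksums here do not prove the tables are identical. In this sketch, two copies of the same row cancel out under XOR, and `EXCEPT` ignores duplicate counts, so tables that differ only in how often a row is repeated can slip past both checks.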