February 27, 2020 / by Open Data Portal Team / In Open Data, Data Portal

Lobbyist Dataset Deduplication Corrections

We have discovered that some of the Lobbyist Data datasets contain what amount to duplicate records. These are a side effect of the revisions that registered lobbyists are permitted to make to their reported compensation, and that they do in fact make fairly often.
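To illustrate how a revision can produce a near-duplicate, here is a minimal Python sketch. It is not the logic being applied on the portal. Every field name other than LOBBYIST_ID (CLIENT, PERIOD, COMPENSATION, REVISED_ON) and all of the sample values are hypothetical, and "keep the most recent revision" is only one plausible rule.

```python
from datetime import date

# Hypothetical sample rows. Only LOBBYIST_ID is a column name taken from
# the announcement; every other field and value is an assumption.
rows = [
    {"LOBBYIST_ID": 101, "CLIENT": "Acme", "PERIOD": "2019-Q4",
     "COMPENSATION": 5000, "REVISED_ON": date(2020, 1, 10)},
    # Same lobbyist, client, and period, but with revised compensation.
    {"LOBBYIST_ID": 101, "CLIENT": "Acme", "PERIOD": "2019-Q4",
     "COMPENSATION": 6500, "REVISED_ON": date(2020, 2, 3)},
    {"LOBBYIST_ID": 102, "CLIENT": "Globex", "PERIOD": "2019-Q4",
     "COMPENSATION": 1200, "REVISED_ON": date(2020, 1, 15)},
]


def keep_latest_revision(rows, key_fields, revised_field):
    """Keep one row per key: the one with the latest revision date."""
    latest = {}
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        current = latest.get(key)
        if current is None or row[revised_field] > current[revised_field]:
            latest[key] = row
    return list(latest.values())


deduped = keep_latest_revision(
    rows,
    key_fields=("LOBBYIST_ID", "CLIENT", "PERIOD"),
    revised_field="REVISED_ON",
)
```

In this sketch the two Acme rows collapse into the later one, so `deduped` holds one row per lobbyist, client, and period.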

We have introduced new logic intended to eliminate this duplication and will apply it to the datasets shortly. As a consequence, the values of the ID columns (e.g., LOBBYIST_ID) will change. The new values will remain internally consistent but will not match previous values. Please make any necessary changes to your uses of the data, and export copies of the current datasets if you will need them. If you do not get a chance to export them before we make this change, you should be able to use the Dataset Snapshot feature in each dataset to download a copy. (Please note that, as a general rule, you should not rely on these backups being available, because of how the datasets are normally updated, but we will manually force backups in this case. If you will need them, please download them as soon as possible, since we cannot guarantee how long they will remain available.)
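If you have stored the old ID values, one possible way to carry them forward is to match an export of the current dataset against the updated dataset on fields that do not change. The Python sketch below assumes this approach. The FIRST_NAME and LAST_NAME columns, the sample rows, and the build_id_mapping helper are all hypothetical; only LOBBYIST_ID comes from the datasets as described here. Names are not guaranteed to be unique, so any key that matches more than one new ID is skipped and should be checked by hand.

```python
def build_id_mapping(old_rows, new_rows, key_fields, id_field="LOBBYIST_ID"):
    """Map old ID values to new ones by matching rows on a natural key.

    Keys that match zero or several new IDs are skipped as ambiguous.
    """
    def index(rows):
        by_key = {}
        for row in rows:
            # Normalize case and whitespace so trivial differences still match.
            key = tuple(str(row[f]).strip().upper() for f in key_fields)
            by_key.setdefault(key, set()).add(row[id_field])
        return by_key

    old_index = index(old_rows)
    new_index = index(new_rows)

    mapping = {}
    for key, old_ids in old_index.items():
        new_ids = new_index.get(key, set())
        if len(new_ids) == 1:
            (new_id,) = new_ids
            # Several old IDs may collapse into one new ID after deduplication.
            for old_id in old_ids:
                mapping[old_id] = new_id
    return mapping


# Hypothetical rows standing in for an old export and the updated dataset.
old_rows = [
    {"LOBBYIST_ID": 11, "FIRST_NAME": "Ana", "LAST_NAME": "Diaz"},
    {"LOBBYIST_ID": 12, "FIRST_NAME": "Ana", "LAST_NAME": "Diaz"},
    {"LOBBYIST_ID": 13, "FIRST_NAME": "Bo", "LAST_NAME": "Lee"},
]
new_rows = [
    {"LOBBYIST_ID": 501, "FIRST_NAME": "Ana", "LAST_NAME": "Diaz"},
    {"LOBBYIST_ID": 502, "FIRST_NAME": "bo ", "LAST_NAME": "LEE"},
]

mapping = build_id_mapping(old_rows, new_rows, ("FIRST_NAME", "LAST_NAME"))
```

Any old ID that is absent from `mapping` had no unambiguous match and would need manual review.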

The other significant change is that some incomplete records will now be filtered out in some datasets:

As always, please contact the City of Chicago Open Data Team at dataportal@cityofchicago.org or @ChicagoCDO with any questions related to using the Data Portal. However, please direct any subject matter questions about these datasets to the Board of Ethics at (312) 744-9660, by email to its Executive Director at steve.berlin@cityofchicago.org, or by direct message to @ChicagoEthicsBd.