Data Cleansing...
Discussion
From what I understand, it is a program driven by a set of rules. It goes through all the data and 'cleanses' it, from simple rules such as 'delete all data that hasn't been updated for 6 months' to ... something more complicated.
I was told yesterday that it is massive, and even more so in a recession, as companies will want to make sure their data is optimised so they don't waste valuable marketing budget by sending mailshots or the like to companies that no longer trade?
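To make the "set of rules" idea concrete, here is a rough sketch in Python. The field names (last_updated, still_trading), the sample companies and the 183-day cut-off are all made up for illustration; a real cleansing tool will have its own rule format.

```python
from datetime import date

def is_stale(record, today, max_age_days=183):
    """Rule: the record has not been updated for roughly six months."""
    return (today - record["last_updated"]).days > max_age_days

def has_ceased_trading(record, today):
    """Rule: the company is flagged as no longer trading."""
    return not record["still_trading"]

def cleanse(records, today, rules):
    """Return only the records that no rule rejects."""
    return [r for r in records if not any(rule(r, today) for rule in rules)]

# Hypothetical mailing-list records, purely for illustration.
records = [
    {"company": "Acme Ltd", "last_updated": date(2009, 3, 1), "still_trading": True},
    {"company": "Old Co", "last_updated": date(2008, 1, 15), "still_trading": True},
    {"company": "Gone Ltd", "last_updated": date(2009, 2, 20), "still_trading": False},
]

kept = cleanse(records, date(2009, 4, 1), [is_stale, has_ceased_trading])
print([r["company"] for r in kept])
```

Each rule is just a function that says "reject this record", so more complicated rules can be added to the list without touching the loop.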
michaeljclark said:
Well I need some of that!
On my primary file server where all my users save their work, I have 800Gb of data. Of that 500Gb hasn't been touched in over a year.........
That's just bad housekeeping!
I think data-cleansing is just for financial docs, spreadsheets etc. Spotting inaccuracies etc.
Jasandjules said:
V8mate said:
I think data-cleansing is just for financial docs, spreadsheets etc. Spotting inaccuracies etc.
I agree. It's more for those who keep confidential records for six years; once those six years are up, you want to delete the data and make sure it is deleted properly.
It's a data quality 'thing': http://en.wikipedia.org/wiki/Data_cleansing
V8mate said:
So simply a quality control thing? My old bank used to have a team for that sort of thing internally.
V8mate said:
michaeljclark said:
Well I need some of that!
On my primary file server where all my users save their work, I have 800Gb of data. Of that 500Gb hasn't been touched in over a year.........
That's just bad housekeeping!
I think data-cleansing is just for financial docs, spreadsheets etc. Spotting inaccuracies etc.
Every so often I have a clear-up, but it's a brave man who tries to get users to clear their data up.
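For the file server case, a rough Python sketch of how you might at least measure the problem before asking anyone to delete anything. It assumes "hasn't been touched" means the file's modification time (not access time), and the function names are just for illustration.

```python
import os
import time

def find_stale_files(root, max_age_days=365, now=None):
    """Yield (path, size_in_bytes) for files not modified in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                yield path, info.st_size

def total_stale_bytes(root, max_age_days=365, now=None):
    """Add up the size of every stale file under root."""
    return sum(size for _path, size in find_stale_files(root, max_age_days, now))
```

This only reports; it deliberately deletes nothing, so the list can be sent to users first.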
Jasandjules said:
V8mate said:
So simply a quality control thing? My old bank used to have a team for that sort of thing internally.
Your use of language made me chuckle... "My old bank..." You sound like some old City duffer who sold the family bank to Morgan Stanley in the 80s
Jasandjules said:
V8mate said:
Your use of language made me chuckle... "My old bank..." You sound like some old City duffer who sold the family bank to Morgan Stanley in the 80s
Coming to DDs next week?
AB said:
From what I understand, it is a program driven by a set of rules. It goes through all the data and 'cleanses' it, from simple rules such as 'delete all data that hasn't been updated for 6 months' to ... something more complicated.
I was told yesterday that it is massive, and even more so in a recession, as companies will want to make sure their data is optimised so they don't waste valuable marketing budget by sending mailshots or the like to companies that no longer trade?
It's not far off that. You remove the records which are proven not to be as good, scan the data against what you have already got, check the numbers are not on TPS, and make sure they're not on your MDNC lists or caught by any other criteria you wish not to call.
IIRC it can take a while, especially if you have calling DBs such as ours with a fair few million unique records.
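The screening steps described above could be sketched roughly like this in Python. Treating the TPS and do-not-call lists as plain collections of phone numbers is an assumption for illustration, as are the function names; real lists arrive in their own formats.

```python
def normalise_number(number):
    """Keep digits only so '01234 567890' and '01234567890' compare equal."""
    return "".join(ch for ch in number if ch.isdigit())

def screen_numbers(new_numbers, existing, tps, do_not_call):
    """Return new numbers that are unique, not already held and not suppressed."""
    already_held = {normalise_number(n) for n in existing}
    blocked = {normalise_number(n) for n in tps}
    blocked |= {normalise_number(n) for n in do_not_call}
    seen = set()
    callable_numbers = []
    for raw in new_numbers:
        number = normalise_number(raw)
        if not number or number in seen:
            continue  # empty or a duplicate within the new batch
        if number in already_held or number in blocked:
            continue  # already in the database, or must not be called
        seen.add(number)
        callable_numbers.append(number)
    return callable_numbers
```

Using sets keeps each lookup cheap, which matters once the database runs to a few million records.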
Data cleansing is a somewhat misused term. It is more a case of data enhancement.
The most common cases you'll come across will be in dealing with name and address data. A typical customer may have more than one source of address data, and this can have been input by the customer themselves, by an operator, or sourced from a third party. It's highly likely that the information received will be in differing formats, badly formatted or just downright bad (bad spelling etc.). The likes of Trillium et al are well versed in cleaning up this data based on vast collections of rules, formats and reference data (e.g. PAF files).
You'll often hear the phrase "single version of the truth" in organisations, and this is what data cleansing aims to provide.
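As a toy illustration of the "single version of the truth" idea, here is a Python sketch that tidies names and postcodes and keeps the most recently updated record for each customer. It is nothing like what Trillium or PAF-based matching actually does; the field names, sample records and the "newest record wins" rule are all assumptions.

```python
import re
from datetime import date

def normalise_postcode(postcode):
    """Upper-case and put a single space before the final three characters."""
    compact = re.sub(r"\s+", "", postcode).upper()
    return compact[:-3] + " " + compact[-3:] if len(compact) > 3 else compact

def match_key(record):
    """Build a crude matching key from the tidied name and postcode."""
    name = " ".join(record["name"].lower().split())
    return (name, normalise_postcode(record["postcode"]))

def single_version(records):
    """Keep the most recently updated record for each matching key."""
    best = {}
    for record in records:
        key = match_key(record)
        if key not in best or record["updated"] > best[key]["updated"]:
            best[key] = record
    return list(best.values())

# Hypothetical records from three different sources.
records = [
    {"name": "John  Smith", "postcode": "sw1a1aa", "updated": date(2008, 5, 1), "source": "operator"},
    {"name": "john smith", "postcode": "SW1A 1AA", "updated": date(2009, 2, 1), "source": "web form"},
    {"name": "Jane Doe", "postcode": "m1 1ae", "updated": date(2009, 1, 10), "source": "third party"},
]
merged = single_version(records)
```

Exact-key matching like this will miss misspellings, which is where the commercial tools and their reference data earn their keep.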
Jasandjules said:
V8mate said:
I am pleased to say that I bailed out of the City a few years ago...... Now I live out in the real countryside..... Sort of.
Often done when companies are migrating data from one computer application to another - you might decide you want to junk every record over a certain age or perhaps not migrate people who have changed certain address details etc.
Done *loads* of these over the years and I'm trying to get work doing more. Can I get even an interview - can I fk!