Data Cleansing...

Author
Discussion

AB

Original Poster:

17,408 posts

202 months

Thursday 19th March 2009
I understand what it is, but is it big? Is there a massive call for companies to come along and offer data cleansing services?


Mattygooner

5,301 posts

211 months

Thursday 19th March 2009
Is it like ethnic? only with data?


V8mate

45,899 posts

196 months

Thursday 19th March 2009
Getting rid of all the numbers you don't want HMRC/shareholders/the wife etc seeing? smile

AB

Original Poster:

17,408 posts

202 months

Thursday 19th March 2009
From what I understand it is a program set up with a set of rules. It then goes through all the data and 'cleanses' it - from simple rules such as 'delete all data that hasn't been updated for 6 months' to ... something more complicated.

I was told yesterday that it is massive, and even more so in a recession as companies will want to make sure their data is optimised in order to not waste valuable marketing budget etc by sending mailshots or the like to companies that no longer trade etc?
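To make the "set of rules" idea concrete, here is a minimal sketch, assuming each record is a dict with a last-updated date and a trading flag. The field names, the sample companies and the 182-day cut-off are all made up for illustration; a real cleansing tool would have far more rules than this.

```python
from datetime import date, timedelta

# Hypothetical records; the field names are assumptions for illustration.
records = [
    {"company": "Acme Ltd", "last_updated": date(2009, 3, 1), "trading": True},
    {"company": "Old Co", "last_updated": date(2008, 6, 1), "trading": True},
    {"company": "Gone Ltd", "last_updated": date(2009, 2, 1), "trading": False},
]

def is_fresh(record, today, max_age_days=182):
    """Rule: keep records updated within roughly the last six months."""
    return today - record["last_updated"] <= timedelta(days=max_age_days)

def is_trading(record, today):
    """Rule: keep only companies still flagged as trading."""
    return record["trading"]

# Each rule takes (record, today) and returns True to keep the record.
RULES = [is_fresh, is_trading]

def cleanse(records, today, rules=RULES):
    """Return the records that pass every rule."""
    return [r for r in records if all(rule(r, today) for rule in rules)]

cleaned = cleanse(records, today=date(2009, 3, 19))
print([r["company"] for r in cleaned])  # prints ['Acme Ltd']
```

Keeping each rule as its own small function means the "something more complicated" rules can be added to the list without touching the loop.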


V8mate

45,899 posts

196 months

Thursday 19th March 2009
Google brings up many companies offering such services. Crowded market it seems.

michaeljclark

613 posts

238 months

Thursday 19th March 2009
Well I need some of that!

On my primary file server where all my users save their work, I have 800GB of data. Of that 500GB hasn't been touched in over a year.........
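For anyone wanting to measure that sort of thing, a rough sketch follows, assuming "touched" means the file's last-modified time (last-access time is often unreliable on servers). The function name and the 365-day default are my own choices, not anything standard.

```python
import os
import time

def stale_bytes(root, max_age_days=365, now=None):
    """Return (total_bytes, stale_bytes) for regular files under root.

    A file counts as stale when its modification time is older than
    max_age_days. Using mtime rather than atime is an assumption.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    total = stale = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            total += info.st_size
            if info.st_mtime < cutoff:
                stale += info.st_size
    return total, stale
```

Calling `stale_bytes` on the share's root folder gives the two totals in bytes. It only reports; deciding what to archive or delete is a separate (and braver) step.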

V8mate

45,899 posts

196 months

Thursday 19th March 2009
michaeljclark said:
Well I need some of that!

On my primary file server where all my users save their work, I have 800GB of data. Of that 500GB hasn't been touched in over a year.........
That's just bad-housekeeping!

I think data-cleansing is just for financial docs, spreadsheets etc. Spotting inaccuracies etc.

shouldbworking

4,773 posts

219 months

Thursday 19th March 2009
It's a term that's horribly misapplied to any number of data-related activities. Just say exactly what you want.

Jasandjules

70,505 posts

236 months

Thursday 19th March 2009
V8mate said:
I think data-cleansing is just for financial docs, spreadsheets etc. Spotting inaccuracies etc.
I agree. It's more for those who keep confidential records for six years; once those six years are up, you want to delete the data and make sure it is deleted properly.

V8mate

45,899 posts

196 months

Thursday 19th March 2009
Jasandjules said:
V8mate said:
I think data-cleansing is just for financial docs, spreadsheets etc. Spotting inaccuracies etc.
I agree. It's more for those who keep confidential records for six years; once those six years are up, you want to delete the data and make sure it is deleted properly.
nono It's not data disposal.

It's a data quality 'thing' http://en.wikipedia.org/wiki/Data_cleansing

Jasandjules

70,505 posts

236 months

Thursday 19th March 2009
V8mate said:
nono It's not data disposal.

It's a data quality 'thing' http://en.wikipedia.org/wiki/Data_cleansing
So simply a quality control thing? My old bank used to have a team for that sort of thing internally.

michaeljclark

613 posts

238 months

Thursday 19th March 2009
V8mate said:
michaeljclark said:
Well I need some of that!

On my primary file server where all my users save their work, I have 800GB of data. Of that 500GB hasn't been touched in over a year.........
That's just bad-housekeeping!

I think data-cleansing is just for financial docs, spreadsheets etc. Spotting inaccuracies etc.
Maybe so, however the FSA require that we keep client data for a minimum of 5 years - that accounts for a fair portion of it.

Every so often I have a clear up, but it's a brave man who gets users to clear their data up

V8mate

45,899 posts

196 months

Thursday 19th March 2009
Jasandjules said:
V8mate said:
nono It's not data disposal.

It's a data quality 'thing' http://en.wikipedia.org/wiki/Data_cleansing
So simply a quality control thing? My old bank used to have a team for that sort of thing internally.
hehe

Your use of language made me chuckle... "My old bank..." you sound like some old City duffer who sold the family bank to Morgan Stanley in the 80s hehe

Jasandjules

70,505 posts

236 months

Thursday 19th March 2009
V8mate said:
hehe

Your use of language made me chuckle... "My old bank..." you sound like some old City duffer who sold the family bank to Morgan Stanley in the 80s hehe
LOL, I feel like an old duffer.......... I guess it's once a merchant banker always a, you get the picture.............

V8mate

45,899 posts

196 months

Thursday 19th March 2009
Jasandjules said:
V8mate said:
hehe

Your use of language made me chuckle... "My old bank..." you sound like some old City duffer who sold the family bank to Morgan Stanley in the 80s hehe
LOL, I feel like an old duffer.......... I guess it's once a merchant banker always a, you get the picture.............
Still working in London?

Coming to DDs next week?

Jasandjules

70,505 posts

236 months

Thursday 19th March 2009
V8mate said:
Still working in London?

Coming to DDs next week?
I am pleased to say that I bailed out of the City a few years ago...... Now I live out in the real countryside..... Sort of.

ChristianZS

2,640 posts

220 months

Thursday 19th March 2009
AB said:
From what I understand it is a program set up with a set of rules. It then goes through all the data and 'cleanses' it - from simple rules such as 'delete all data that hasn't been updated for 6 months' to ... something more complicated.

I was told yesterday that it is massive, and even more so in a recession as companies will want to make sure their data is optimised in order to not waste valuable marketing budget etc by sending mailshots or the like to companies that no longer trade etc?
It's not far off that. You remove the records which are proven not to be as good, scan what's left against what you have already got, check the numbers are not on TPS, and make sure they're not on your MDNC lists or matching any other criteria you wish not to call.

IIRC it can take a while, especially if you have calling DBs such as ours with a fair few million unique records.
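A rough sketch of that screening step, assuming the TPS file and the in-house do-not-call list (which is what I take MDNC to mean) are just lists of phone numbers. The function names and the number clean-up are simplified illustrations, and the sample numbers come from the range reserved for TV drama use.

```python
def normalise_number(raw):
    """Reduce a UK phone number to digits with a leading 0 (simplified)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if digits.startswith("0044"):
        digits = "0" + digits[4:]
    elif digits.startswith("44"):
        digits = "0" + digits[2:]
    return digits

def screen(new_records, existing_numbers, tps_numbers, do_not_call):
    """Drop records that are blocked, already held, or duplicated in the batch."""
    blocked = {normalise_number(n) for n in tps_numbers}
    blocked |= {normalise_number(n) for n in do_not_call}
    seen = {normalise_number(n) for n in existing_numbers}
    kept = []
    for record in new_records:
        number = normalise_number(record["phone"])
        if not number or number in blocked or number in seen:
            continue
        seen.add(number)  # later copies of this number count as duplicates
        kept.append(record)
    return kept

batch = [
    {"name": "A", "phone": "020 7946 0001"},
    {"name": "B", "phone": "+44 20 7946 0002"},  # on the TPS list
    {"name": "C", "phone": "02079460003"},       # already in the database
    {"name": "D", "phone": "020-7946-0001"},     # duplicate of A
    {"name": "E", "phone": "020 7946 0004"},     # on the do-not-call list
]
kept = screen(
    batch,
    existing_numbers=["020 7946 0003"],
    tps_numbers=["02079460002"],
    do_not_call=["+44 20 7946 0004"],
)
print([r["name"] for r in kept])  # prints ['A']
```

Holding the lists as sets keeps each lookup cheap, which matters once you are into millions of records, though loading and normalising the lists still takes time.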

merc_man

1,926 posts

209 months

Thursday 19th March 2009
Data cleansing is a somewhat misused term. It is more a case of data enhancement.

The most common cases you'll come across will be in dealing with name and address data. A typical customer may have more than one source of address data and this can have been input either by the customer themselves, an operator or sourced from a third party. It's highly likely that the information received will be in differing formats, badly formatted or just downright bad (bad spelling etc.). The likes of Trillium et al are well versed in cleaning up this data based on vast collections of rules, formats and reference data (e.g. PAF files).

You'll often hear the phrase "single version of the truth" in organisations, and this is what data cleansing aims to provide.
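As a toy illustration of getting to one record per address, here is a sketch that tidies the address line and postcode and then merges records sharing the same tidied key. The abbreviation table and field names are assumptions; commercial tools use far larger rule sets and check against reference data such as PAF, which this does not attempt.

```python
import re

# Tiny, hypothetical abbreviation table. Real rule sets are much larger,
# and "st" can also mean "saint", which this sketch ignores.
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue", "ln": "lane"}

def normalise_postcode(raw):
    """Upper-case a UK postcode with one space before the last three characters."""
    compact = re.sub(r"\s+", "", raw).upper()
    return compact[:-3] + " " + compact[-3:] if len(compact) > 3 else compact

def normalise_line(raw):
    """Lower-case, strip punctuation and expand common abbreviations."""
    words = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

def match_key(record):
    """Records with the same key are treated as the same address."""
    return (normalise_line(record["address"]), normalise_postcode(record["postcode"]))

def merge(sources):
    """Keep the first record seen for each address key."""
    golden = {}
    for record in sources:
        golden.setdefault(match_key(record), record)
    return list(golden.values())

sources = [
    {"name": "J Smith", "address": "12 High St.", "postcode": "co1 1aa"},
    {"name": "John Smith", "address": "12, high street", "postcode": "CO11AA"},
    {"name": "A Jones", "address": "3 Mill Rd", "postcode": "CO2 7BB"},
]
merged = merge(sources)
print(len(merged))  # prints 2
```

Keeping the first record per key is the crudest possible survivorship rule; picking the best field from each source is where the real effort goes.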

V8mate

45,899 posts

196 months

Thursday 19th March 2009
Jasandjules said:
V8mate said:
Still working in London?

Coming to DDs next week?
I am pleased to say that I bailed out of the City a few years ago...... Now I live out in the real countryside..... Sort of.
Me too - Colchester is a bit of a schizo-town; sits in Essex but has a much more 'Suffolky' feel.

Orb the Impaler

1,881 posts

197 months

Thursday 19th March 2009
Often done when companies are migrating data from one computer application to another - you might decide you want to junk every record over a certain age, or perhaps not migrate people who have changed certain address details etc.

Done *loads* of these over the years and I'm trying to get work doing more. Can I get even an interview - can I fk! frown