[Clipart] SVN changed and updated

Jochen Staerk jstaerk at usegroup.de
Tue Mar 3 13:50:55 PST 2009


> I'm concerned about how to get a db copy that has privacy info
> scrubbed. I'm a bit paranoid about this. Does anyone know how
> Wikipedia handles this?

I'm not sure Wikipedia needs this. I imagine the Wikimedia Foundation
runs the servers itself, so it may have full-time professionals(?).

Suggestion: over the next few days I'll check whether my workaround gets
me any further. If it does, I won't need a database for another file
release. If it doesn't, I suggest we consider a QA system, so that testing
and QA happen with the developers while production stays on, well, the
production system.

The QA system could then be an obfuscated production system, i.e. you
copy the database and then run some commands to anonymize the content
and remove the sensitive data.

I have had good experience with very simple obfuscation: every uppercase
letter was replaced by a W, every lowercase letter by a w, and every digit
by a 9, while special characters were left untouched. This worked well:
the widths of the rendered strings stayed similar, and because we left the
tricky cases (the special characters) in, all modules could get used to
real UTF-8. In that case the content was something like a phone list, and
it looked like

Wwwwww Wwäww 9999
Wwwwwwww Ww Wwwwww 9999
Www wé Wwwwwwww Wwwwwwwww ++9 999 9999999
Wwww W'Wwwww 9999
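
A minimal sketch of that substitution in Python, assuming that only the
ASCII letters A-Z, a-z and the digits 0-9 are masked and everything else
(umlauts, accents, punctuation, spaces) passes through unchanged. The
function name obfuscate is made up for illustration; this is not the code
we used back then.

```python
import string

# Translation table: 26 uppercase -> "W", 26 lowercase -> "w", 10 digits -> "9".
# Characters that are not in the table (e.g. ä, é, +, ') are left as they are.
_MASK = str.maketrans(
    string.ascii_uppercase + string.ascii_lowercase + string.digits,
    "W" * 26 + "w" * 26 + "9" * 10,
)


def obfuscate(text: str) -> str:
    """Mask ASCII letters and digits while keeping string length and layout."""
    return text.translate(_MASK)


print(obfuscate("Hans Müller 0171"))  # prints "Wwww Wüwwww 9999"
```

Because each character maps to exactly one character, the string lengths
stay the same, which is what keeps the rendered widths comparable.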


I think we did this with 2x26+10 = 62 MySQL "replace" calls (we kept the
commands in a text file and reapplied them with phpMyAdmin), but a regex
on the dump or similar measures might also be possible.
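
Such a text file of commands could be generated rather than typed by hand.
The following is only a sketch: the table name contacts and the column
name name are hypothetical, and it assumes one UPDATE per character using
MySQL's REPLACE() string function, which matches case-sensitively. The
original file may well have looked different.

```python
import string


def replace_statements(table: str, column: str) -> list[str]:
    """Build one UPDATE ... REPLACE() statement per ASCII letter and digit."""
    groups = [
        (string.ascii_uppercase, "W"),  # 26 statements
        (string.ascii_lowercase, "w"),  # 26 statements
        (string.digits, "9"),           # 10 statements
    ]
    return [
        f"UPDATE {table} SET {column} = REPLACE({column}, '{ch}', '{mask}');"
        for chars, mask in groups
        for ch in chars
    ]


# Hypothetical table and column names, for illustration only.
statements = replace_statements("contacts", "name")
print(len(statements))
print(statements[0])
```

Three of the statements ('W' to 'W', 'w' to 'w', '9' to '9') change
nothing and could be dropped; they are kept here so the count matches
2x26+10.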

kind regards,

--

with kind regards
Jochen Stärk

www.usegroup.de            (home office)
Albigerstr. 22             Am Wald 3
55232 Alzey                55270 Ober-Olm

Tel: (06731)997997-5       (06131)584278-0
Fax: (06731)997997-6       (06131)584278-1
Mobil: (0177)4512645


