[Clipart] Namespace conflicts on the filenames.

Jonadab the Unsightly One jonadab at bright.net
Wed Jan 5 19:34:18 PST 2005


[Note:  I have CCed another mailing list because I know it has some
Macheads on it, who might be able to answer the question below about
StuffIt Expander and filename length.  Please reply to the clipart
list (only) unless you subscribe to the other list.]

Nicu Buculei <nicu at apsro.com> writes:

> another way to receive unwanted duplicates is inside tarballs - this
> make harder to compare filenames

This is one of the reasons (metadata propagation being another) that I
wanted to roll the unpackaging code into the upload script, but for
performance reasons that isn't going to be a good solution.  However,
it _is_ possible to have a separate script that does the unpackaging
in a manner that takes these issues (both metadata and filename
conflicts) into account.

>> How can we track what filenames have been used, across releases?
>> A special directory of zero-size files perhaps?  Some other way?
>
> how about generating *unique* file names? for example adding to the
> initial file names some letters from author name and some random (or
> not random, say incremental) string.
> example: my snowman.svg become nb_snowman_1024579.svg. it may be
> enough to title the file 1024579_snowman.svg

I think I like the idea of including some or all of the author's
name.  That would at least prevent one person's submissions from
trampling over another's.  But I think it would be better to have the
title first...  hmm...

What about title_author_yyyymmdd_nn.svg?

For example, if today I uploaded two SVG images both entitled
"Winter's Beauty", they would be
winter_s_beauty_jonadab_20050105_01.svg and
winter_s_beauty_jonadab_20050105_02.svg

Would that be a reasonable arrangement?  It would be pretty easy to
implement.  These are, of course, just filenames.  The metadata will
still be whatever it is, so that once the DMS is fully integrated
we'll be able to search, sort, and categorize by their real titles and
keywords and stuff.
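
Here is a rough sketch of how the naming could work, following the
example filenames above (eight-digit date, two-digit sequence number).
It is written in Python purely for illustration; the function names
(slugify, make_filename, next_free_name) are made up for this sketch,
and `used` stands in for whatever mechanism ends up tracking the names
already taken.

```python
import re
from datetime import date

def slugify(text):
    """Lowercase text and turn each run of other characters into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")

def make_filename(title, author, day, seq, ext="svg"):
    """Build title_author_yyyymmdd_nn.ext from its parts (hypothetical helper)."""
    return f"{slugify(title)}_{slugify(author)}_{day:%Y%m%d}_{seq:02d}.{ext}"

def next_free_name(title, author, day, used):
    """Return the first name for this title/author/day that is not in `used`."""
    for seq in range(1, 100):  # nn is two digits, so stop at 99
        name = make_filename(title, author, day, seq)
        if name not in used:
            return name
    raise ValueError("more than 99 same-titled uploads in one day")
```

Calling next_free_name twice for "Winter's Beauty" by jonadab on
2005-01-05, adding each result to `used`, gives the _01 and _02 names
shown above.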

The bummer is, this makes rather lengthy filenames, which has started
me thinking...

Could this become a problem on some platforms?  What's the shortest
filename limit on any system still in current use on workstations?  I
think MacOS 9 has a pretty short limit, doesn't it?  (Anybody remember
_how_ short?  Was it 256?  I remember it's longer than any name you
would ever normally give a file but short enough that it'd be easy to
run into in obscure edge cases like this one...)  What does StuffIt
Expander or whatever do if it encounters a filename longer than that?
(If it just truncates it, then I won't worry about it.  If it refuses
to extract the file, then I might have to do a length check and
truncate the title or author or something.)

Update:  a Google search reveals that MacOS 8 (or I think even MacOS
9 with a filesystem created originally under MacOS 8) has an even
shorter limit, 31 characters.  That makes this issue even more
pressing, since MacOS 8 is really not old enough that I want to tell
its users to go away if we can avoid it.  As noted above, I don't mind
if StuffIt truncates the names for them; can someone confirm that it
does?  Mark?  Andy?
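
If it turns out we do need the length check, one possible approach is
to keep the author, date, and sequence number intact (they are what
make the name unique) and shorten only the title.  A minimal sketch in
Python, assuming the parts have already been reduced to lowercase and
underscores, that the title is non-empty, and that 31 characters is
the limit to respect; fit_filename is a made-up name for this example.

```python
def fit_filename(title, author, datestr, seq, limit=31, ext="svg"):
    """Shorten only the title so the whole filename fits within `limit`."""
    # Everything after the title is kept whole, since it carries uniqueness.
    suffix = f"_{author}_{datestr}_{seq:02d}.{ext}"
    room = limit - len(suffix)
    if room < 1:
        raise ValueError("author name alone leaves no room for a title")
    # Drop any underscore left dangling at the cut point.
    return title[:room].rstrip("_") + suffix
```

With the example from earlier, the suffix _jonadab_20050105_01.svg
already uses 24 characters, so "winter_s_beauty" gets cut down to
"winter".  That shows how little of the title survives under a
31-character limit, which may argue for shortening the author or date
portion instead.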

Of course we've already given up on sticking to 8.3 for DOS's sake.  I
deemed that Not Worth It, on the grounds that most DOS users these days
either have another OS as well or aren't downloading new stuff like
clipart libraries.  Besides, pkunzip 2.5 and later will truncate the
filenames, and it has command-line options for listing the contents
and for extracting individual files.  So even if two files share the
same first eight characters and extension, DOS users can still get
both files, with a little effort.

-- 
$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}}
split//,"ten.thgirb\@badanoj$/ --";$\=$ ;-> ();print$/



