Opened 16 years ago

Closed 16 years ago

#1 closed defect (fixed)

non-ASCII characters in darcs output cause a crash

Reported by: warner
Owned by: somebody
Priority: major
Milestone:
Component: component1
Version:
Keywords:
Cc:
Launchpad Bug:

Description

'ndurner' noticed darcsver crashing, due to a non-ASCII character in the
output of 'darcs changes --xml-format'. It looks like the German Windows
machine emitted a 'local_date' attribute with a long timezone name, something
like "Westeuropaische Normalzeit", except using an a-with-umlaut in the first
word. It looks like the name was encoded with Latin-1.
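
A minimal sketch of the failure mode, written for Python 3. The XML snippet is a made-up stand-in for the darcs output, and xml.dom.minidom is an assumption (the ticket does not say which parser darcsver uses). With no encoding declaration the parser assumes UTF-8, so a lone Latin-1 byte such as 0xE4 is rejected:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical stand-in for darcs XML output; 0xE4 is Latin-1 a-with-umlaut.
raw = b'<changelog><patch local_date="Westeurop\xe4ische Normalzeit"/></changelog>'

def parses_cleanly(data):
    """Return True if the XML parser accepts the bytes, False otherwise."""
    try:
        minidom.parseString(data)
        return True
    except ExpatError:
        return False

print(parses_cleanly(raw))                         # Latin-1 byte left in place
print(parses_cleanly(raw.replace(b"\xe4", b"-")))  # offending byte replaced
```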

darcsver has a hack to discard funny-looking characters before it passes the
string to the XML parser, because apparently it's awfully hard to get darcs
to declare a character encoding for its XML output, or for darcs to stick to
that encoding (the local_date string is probably coming from some Windows
time/date library, and who knows how to control the encoding *that* uses).
But the hack doesn't discard enough.

My suggestion is to replace everything that isn't printable ASCII with "?":

import string  # Python 2: relies on string.maketrans and list-returning range()
allbadchars = "".join([chr(i) for i in range(0x20) + range(0x7f,0x100)])
tt = string.maketrans(allbadchars, "?"*len(allbadchars))

(really, we could probably discard everything that isn't an angle bracket or
the word "patch", since all darcsver really cares about is how many
<patch??? tokens appear in the file)
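
A rough sketch of that idea in Python 3: count the opening patch tags straight from the raw bytes, without involving an XML parser at all. The function name count_patches and the sample input are hypothetical, and this is not what darcsver actually does:

```python
import re

def count_patches(darcs_xml_bytes):
    """Count opening <patch tags without parsing the XML."""
    # Require whitespace, "/" or ">" after the name so longer tag names
    # are not counted; closing </patch> tags never match.
    return len(re.findall(rb"<patch[\s/>]", darcs_xml_bytes))

# Made-up sample containing a Latin-1 byte that would upset a parser.
sample = (
    b'<changelog>\n'
    b'<patch author="a" local_date="Westeurop\xe4ische Normalzeit">\n'
    b'<name>x</name>\n'
    b'</patch>\n'
    b'<patch author="b">\n'
    b'</patch>\n'
    b'</changelog>\n'
)
print(count_patches(sample))
```

Because the count works on bytes, the encoding of the timezone name never matters.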

Change History (4)

comment:1 Changed 16 years ago by warner

Oops, it might be nice to preserve newlines. How about:

"".join([chr(i) for i in range(0x0a) + [0x0b, 0x0c] + range(0x0e, 0x20) + range(0x7f,0x100)])

Also, since "?" in an XML file is special (as in <?xml>), how about translating those characters to something else, like "-"?

I think ndurner reports that this seemed to work.
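
For reference, a sketch of the same table for Python 3, where range() no longer returns a list and string.maketrans is gone. The names BAD_BYTES, TABLE and scrub are made up for illustration. Like the expression above, it keeps LF (0x0a) and CR (0x0d) but still replaces tab (0x09):

```python
# Bytes to replace: control characters except LF and CR, plus 0x7f-0xff.
BAD_BYTES = bytes(
    list(range(0x0a)) + [0x0b, 0x0c] + list(range(0x0e, 0x20)) + list(range(0x7f, 0x100))
)
TABLE = bytes.maketrans(BAD_BYTES, b"-" * len(BAD_BYTES))

def scrub(darcs_output):
    """Replace every byte listed in BAD_BYTES with '-'; leave the rest untouched."""
    return darcs_output.translate(TABLE)

print(scrub(b'local_date="Westeurop\xe4ische Normalzeit"\n'))
```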

comment:2 Changed 16 years ago by ndurner

Yes, it works.

comment:3 Changed 16 years ago by zooko

Fixed by [20090211201316-92b7f-e014f2023111b36590bd5ac8336aebe4ad8f491c]. Thanks folks! See also #2, which will make parsing XML output unnecessary.

comment:4 Changed 16 years ago by zooko

  • Resolution set to fixed
  • Status changed from new to closed