Raw and Escaped String Literals
Start up GHCi, the Glasgow Haskell Compiler's interactive interpreter, and try to display a typical Windows file path:
Prelude> putStrLn "C:\test\x41.txt"
C: estA.txt
The result has been mangled because, by default, Haskell interprets escape codes in string literals. The '\t' has been read as a TAB character and the '\x41' as the character with hexadecimal code 41, which is 'A'. An escape code that Haskell does not recognize, such as '\w', fares even worse and produces a compile-time error:
Prelude> putStrLn "C:\work\x41.txt"
<interactive>:1:14: lexical error in string/character literal at character 'o'
In Haskell there is only one way to stop your strings from being mangled, and that is to mangle them yourself by inserting extra escape characters to prevent the backslash characters from being interpreted:
Prelude> putStrLn "C:\\test\\x41.txt"
C:\test\x41.txt
Prelude> putStrLn "C:\\work\\x41.txt"
C:\work\x41.txt
This is all rather messy and unsatisfactory. There must be a better way.
In Python, strings suffer from similar problems:
>>> print "C:\test\x41.txt"
C: estA.txt
Python is not as strict as Haskell about unrecognized escape codes. It leaves the '\w' untouched but still interprets the '\x41':
>>> print "C:\work\x41.txt"
C:\workA.txt
But in Python you can prevent the interpretation of escape characters by putting an 'r', for 'raw string', immediately in front of the string:
>>> print r"C:\test\x41.txt"
C:\test\x41.txt
Unfortunately, the underlying escaped-string implementation still leaks through, because a raw string cannot end with a backslash:
>>> print r"C:\test\"
Line 1
print r"C:\test\"
^
SyntaxError: EOL while scanning single-quoted string
Python guru Fredrik Lundh has explained this problem in more detail, along with ways to work around it.
However, there would be no need for all these explanations and workarounds if raw, uninterpreted strings were the default and you had to indicate explicitly when a string was to have its escape characters interpreted. In Haskell, the latter could be implemented as a function esc :: String -> String. In such a hypothetical Haskell, the simple unadorned string would do what was expected:
Prelude> putStrLn "C:\test\x41.txt"
C:\test\x41.txt
For the special cases where you really do need escape characters, you would call the function explicitly:
Prelude> putStrLn (esc "She said \"Hi!\"")
She said "Hi!"
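For what it is worth, the esc function proposed above can be approximated in today's Haskell. The following is only a minimal sketch, not a definitive implementation: it assumes that the Read instance for String in base (which parses Haskell string-literal syntax) is an acceptable definition of "interpreting escape codes", the helper name escMaybe is made up for this illustration, and falling back to the unchanged input on a malformed escape is a design choice of the sketch rather than part of the proposal.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- | Hypothetical helper: wrap the raw text in double quotes and let the
-- standard Read instance for String interpret the escape codes.
-- Returns Nothing when the text is not a valid string-literal body,
-- for example when it contains an unknown escape such as \w.
escMaybe :: String -> Maybe String
escMaybe raw = readMaybe ("\"" ++ raw ++ "\"")

-- | The esc function from the text, with the signature given there.
-- Assumption of this sketch: malformed input is returned unchanged
-- instead of raising an error.
esc :: String -> String
esc raw = fromMaybe raw (escMaybe raw)
```

Because current Haskell still interprets escapes in literals, the raw input has to be simulated by doubling the backslashes: esc "She said \\\"Hi!\\\"" should give She said "Hi!", while escMaybe "C:\\work" should give Nothing because '\w' is not a valid escape code.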
Explicit is better than implicit, especially when it leads to less confusion.