You're looking for the r+ / a+ / w+ modes, which allow both read and write operations on files.

With r+, the position is initially at the beginning, but reading the file once will push it towards the end, allowing you to append. With a+, the position is initially at the end.

    with open("filename", "r+") as f:
        # here, position is initially at the beginning
        text = f.read()
        # after reading, the position is pushed toward the end
        f.write("stuff to append")

If you ever need to do an entire reread, you can return to the starting position with f.seek(0).

    with open("filename", "r+") as f:
        text = f.read()
        f.write("stuff to append")
        f.seek(0)  # return to the top of the file
        text = f.read()

(Further Reading: What's the difference between 'r+' and 'a+' when open file in python?)

You can also use w+, but this will truncate (delete) all the existing content.

Here's a nice little diagram from another SO post: [image not preserved]

Either you want to read it or you want to write to it. The a or the r specifies a seek to a particular location in the file. Specifying both is asking open to point to two different locations in the file at the same time.

Textfiles in general can't be updated in place. You can use a to add new stuff to the end, but that is about it. To do what I think you want, you need to open the existing file in read mode, open another, new file in write mode, and copy the data from one to the other. After that you have two files, so you have to take care of deleting the old one. If that is troublesome, take a look at the module in-place.

The other alternative is to read the input file into memory, close and reopen it for writing, then write out a new version of the file. Then you don't have to delete the old copy. But if something goes wrong in the middle you will have no old input file, because reopening it for writing erased it, and no new output file either, because you didn't successfully write it. The reason for this is that textfiles are not designed for random access.

I'm having some brain failure in understanding reading and writing text to a file (Python 2.4).

    # The string, which has an a-acute in it.
    ss = u'Capit\xe1n'

So I type Capit\xc3\xa1n into my favorite editor, in file f2.

What am I not understanding here? Clearly there is some vital bit of magic (or good sense) that I'm missing. What does one type into text files to get proper conversions?

What I'm truly failing to grok here is what the point of the UTF-8 representation is, if you can't actually get Python to recognize it when it comes from outside. Maybe I should just JSON dump the string, and use that instead, since that has an asciiable representation! More to the point, is there an ASCII representation of this Unicode object that Python will recognize and decode, when coming in from a file? If so, how do I get it?

    >>> print simplejson.dumps(ss)
    >>> print >> file('f3','w'), simplejson.dumps(ss)

In the notation u'Capit\xe1n\n' (should be just 'Capit\xe1n\n' in 3.x, and must be in 3.0 through 3.2), the \xe1 represents just one character. \x is an escape sequence, indicating that e1 is in hexadecimal.

Writing Capit\xc3\xa1n into the file in a text editor means that it actually contains \xc3\xa1. Those are 8 bytes and the code reads them all. We can see this by reading the file as bytes rather than text in Python 3.x and displaying the result.

Instead, just input characters like á in the editor, which should then handle the conversion to UTF-8 and save it.

In 2.x, a string that actually contains these backslash-escape sequences can be decoded using the string_escape codec:

    # Python 2.x
    >>> print 'Capit\\xc3\\xa1n\n'.decode('string_escape')

The result is a str that is encoded in UTF-8, where the accented character is represented by the two bytes that were written \\xc3\\xa1 in the original string. To get a unicode result, decode again with UTF-8.

In 3.x, the string_escape codec is replaced with unicode_escape, and it is strictly enforced that we can only encode from a str to bytes, and decode from bytes to str. unicode_escape needs to start with a bytes in order to process the escape sequences (the other way around, it adds them), and then it will treat the resulting \xc3 and \xa1 as character escapes rather than byte escapes.
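As a rough sketch of the Python 3.x route described above (not necessarily the original answerer's code): because unicode_escape turns \xc3 and \xa1 into two characters, one way to finish the job is to map those characters back to bytes via latin-1 and then decode the bytes as UTF-8. The helper name `decode_typed_escapes` and the temporary file are assumptions made for illustration only.

```python
import os
import tempfile


def decode_typed_escapes(text):
    """Turn literal '\\xc3\\xa1'-style text into the character it spells in UTF-8.

    Hypothetical helper for illustration; assumes `text` is pure ASCII.
    """
    return (
        text.encode('ascii')         # str -> bytes, since only bytes can be decoded
        .decode('unicode_escape')    # each \xNN becomes one character (U+00NN)
        .encode('latin-1')           # map those characters back to the bytes 0xNN
        .decode('utf-8')             # read the resulting byte pair as UTF-8
    )


# What ends up in the file when the escapes are typed out by hand.
typed = 'Capit\\xc3\\xa1n\n'
fixed = decode_typed_escapes(typed)

# The simpler route: write the real character and let the codec produce UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'f2')
    with open(path, 'w', encoding='utf-8', newline='\n') as f:
        f.write('Capit\xe1n\n')  # \xe1 is the single character a-acute
    with open(path, 'rb') as f:
        raw = f.read()  # here the accented character is stored as two bytes

print(ascii(fixed))
print(raw)
```

The second half is usually the better habit: if the file holds real UTF-8 text, opening it with `encoding='utf-8'` is enough and no escape-sequence round trip is needed.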
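Going back to the earlier point that text files generally can't be updated in place, the "copy into a new file, then swap" approach might look like the sketch below. It assumes Python 3.3 or later (for `os.replace`); `rewrite_file` and `transform` are hypothetical names chosen for this example, not part of any library.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Rewrite `path` by streaming each line through `transform` into a new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the new file next to the old one so the final swap stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as dst, \
                open(path, 'r', encoding='utf-8') as src:
            for line in src:
                dst.write(transform(line))
        # The old file is only replaced once the new one is completely written.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)  # leave the original untouched on failure
        raise
```

For example, `rewrite_file('notes.txt', str.upper)` would upper-case every line of a hypothetical notes.txt. Because the original is swapped out only at the very end, a failure halfway through leaves the old file intact, which is the risk the read-into-memory alternative runs into.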