Archive for July 2010

How to Use UTF-8 with Python

http://evanjones.ca/python-utf8.html

Written by Evan Jones (Evan Jones' Scratch Pad), 2005-October-01 20:15

Tim Bray describes why Unicode and UTF-8 are wonderful much better than I could, so go read that for an overview of what Unicode is, and why all your programs should support it. What I’m going to tell you is how to use Unicode, and specifically UTF-8, with one of the coolest programming languages, Python. I have also written an introduction to Using Unicode in C/C++. Python has good support for Unicode, but there are a few tricks that you need to be aware of. I spent more than a few hours learning these tricks, and I’m hoping that by reading this you won’t have to. This is a very quick and dirty introduction. If you need in-depth knowledge, or need to learn about Unicode in Java or Windows, see Unicode for Programmers. [Updated 2005-09-01: Updated information about XML encoding declarations.]

The Basics

There are two types of strings in Python: byte strings and Unicode strings. As you may have guessed, a byte string is a sequence of bytes. When needed, Python uses your computer’s default locale to convert the bytes into characters. On Mac OS X, the default locale is actually UTF-8, but everywhere else, the default is probably ASCII. This creates a byte string:

byteString = "hello world! (in my default locale)"

And this creates a Unicode string:

unicodeString = u"hello Unicode world!"

Convert a byte string into a Unicode string and back again:

s = "hello byte string"
u = unicode( s )
backToBytes = u.encode()

The previous code uses your default character set to perform the conversions. However, relying on the locale’s character set is a bad idea, since your application is likely to break as soon as someone from Thailand tries to run it on their computer. In most cases it is probably better to explicitly specify the encoding of the string:

s = "hello normal string"
u = unicode( s, "utf-8" )
backToBytes = u.encode( "utf-8" )

Now, the byte string s will be treated as a sequence of UTF-8 bytes to create the Unicode string u. The next line stores the UTF-8 representation of u in the byte string backToBytes.
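The same round trip can be sketched for Python 3. That is an assumption on my part, since the article targets Python 2: in Python 3 the `unicode` type is simply `str`, and byte strings are `bytes`. The sample text is made up for illustration.

```python
# Python 3 sketch (assumption: the article itself uses Python 2).
# str is the Unicode type, bytes is the byte string type.
text = "Tr\u00e8s bien"              # hypothetical sample: "Très bien"
encoded = text.encode("utf-8")       # Unicode string -> UTF-8 bytes
decoded = encoded.decode("utf-8")    # UTF-8 bytes -> Unicode string

# Non-ASCII characters need more than one byte in UTF-8,
# so the byte length exceeds the character count.
char_count = len(text)
byte_count = len(encoded)
```

Naming the encoding on both calls keeps the result independent of the machine's locale, which is the point made above.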

Working With Unicode Strings

Thankfully, everything in Python is supposed to treat Unicode strings identically to byte strings. However, you need to be careful in your own code when testing to see if an object is a string. Do not do this:

if isinstance( s, str ): # BAD: Not true for Unicode strings!

Instead, use the generic string base class, basestring:

if isinstance( s, basestring ): # True for both Unicode and byte strings
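For readers on Python 3, where `basestring` no longer exists, a roughly equivalent check might look like the sketch below. The helper name `is_string` is hypothetical and not part of the article.

```python
def is_string(obj):
    """Return True for both text strings (str) and byte strings (bytes).

    Python 3 sketch: there is no basestring, so both types are listed.
    """
    return isinstance(obj, (str, bytes))


text_result = is_string("hello")      # text string
bytes_result = is_string(b"hello")    # byte string
other_result = is_string(42)          # not a string at all
```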

Reading UTF-8 Files

You can manually convert strings that you read from files; however, there is an easier way:

import codecs
fileObj = codecs.open( "someFile", "r", "utf-8" )
u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file

The codecs module will take care of all the conversions for you. You can also open a file for writing and it will convert the Unicode strings you pass in to write into whatever encoding you have chosen. However, take a look at the note below about the byte-order marker (BOM).
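A minimal sketch of the write-then-read round trip follows. It swaps in `io.open` with an `encoding` argument instead of the article's `codecs.open`, since `io.open` behaves the same way for this purpose and is available in both Python 2.6+ and Python 3. The temporary file location is made up for the example.

```python
import io
import os
import tempfile

# Hypothetical file location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "someFile")

# Writing: the Unicode string is encoded to UTF-8 on the way out.
with io.open(path, "w", encoding="utf-8") as out:
    out.write(u"Comment \u00e7a va ?")

# Reading: the UTF-8 bytes are decoded back into a Unicode string.
with io.open(path, "r", encoding="utf-8") as file_obj:
    u = file_obj.read()

# Reading the raw bytes shows what actually landed on disk.
with io.open(path, "rb") as raw:
    raw_bytes = raw.read()
```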

Working with XML and minidom

I use the minidom module for my XML needs mostly because I am familiar with it. Unfortunately, it only handles byte strings so you need to encode your Unicode strings before passing them to minidom functions. For example:

import xml.dom.minidom
xmlData = u"<français>Comment ça va ? Très bien ?</français>"
dom = xml.dom.minidom.parseString( xmlData )

The last line raises an exception: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 5: ordinal not in range(128). To work around this error, encode the Unicode string into the appropriate format before passing it to minidom, like this:

import xml.dom.minidom
xmlData = u"<français>Comment ça va ? Très bien ?</français>"
dom = xml.dom.minidom.parseString( xmlData.encode( "utf-8" ) )

Minidom can handle any format of byte string, such as Latin-1 or UTF-16. However, it will only work reliably if the XML document has an encoding declaration (e.g. <?xml version="1.0" encoding="Latin-1"?>). If the encoding declaration is missing, minidom assumes that it is UTF-8. It is a good habit to include an encoding declaration on all your XML documents, in order to guarantee compatibility on all systems.

When you get XML out of minidom by calling dom.toxml() or dom.toprettyxml(), minidom returns a Unicode string. You can also pass in an additional encoding="utf-8" parameter to get an encoded byte string, perfect for writing out to a file.
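The parse and serialize steps can be sketched together as follows. This is written for Python 3 (an assumption, since the article uses Python 2), and the variable names are illustrative only.

```python
import xml.dom.minidom

# Hypothetical sample document with a non-ASCII element name.
xml_text = u"<fran\u00e7ais>Comment \u00e7a va ?</fran\u00e7ais>"

# Encode explicitly before parsing, as recommended above.
# With no encoding declaration, the parser treats the bytes as UTF-8.
dom = xml.dom.minidom.parseString(xml_text.encode("utf-8"))
root_name = dom.documentElement.tagName

# With an encoding argument, toxml() returns an encoded byte string
# whose XML declaration names that encoding.
xml_bytes = dom.toxml(encoding="utf-8")
```

The resulting `xml_bytes` can be written straight to a file opened in binary mode.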

The Byte-Order Marker (BOM)

UTF-8 files sometimes start with a byte-order marker (BOM) to indicate that they are encoded in UTF-8. This is commonly used on Windows. On Mac OS X, applications (e.g. TextEdit) ignore the BOM and remove it if the file is saved again. The W3C HTML Validator warns that older applications may not be able to handle the BOM. Unicode effectively ignores the marker, so it should not matter when reading the file. You may wish to add this to the beginning of your files to determine if they are encoded in ASCII or UTF-8. The codecs module provides the constant for you to do this:

import codecs
# Open in binary mode: we are writing raw bytes, not text
out = open( "someFile", "wb" )
out.write( codecs.BOM_UTF8 )
out.write( unicodeString.encode( "utf-8" ) )
out.close()

You need to be careful when using the BOM and UTF-8. Frankly, I think this is a bug in Python, but what do I know. Python will decode the value of the BOM into a Unicode character, instead of ignoring it. For example (tested with Python 2.3):

>>> codecs.BOM_UTF16.decode( "utf16" )
u''
>>> codecs.BOM_UTF8.decode( "utf8" )
u'\ufeff'

For UTF-16, Python decoded the BOM into an empty string, but for UTF-8, it decoded it into a character. Why is there a difference? I think the UTF-8 decoder should do the same thing as the UTF-16 decoder and strip out the BOM. However, it doesn’t, so you will probably need to detect it and remove it yourself, like this:

import codecs
if s.startswith( codecs.BOM_UTF8 ):
    # The byte string s begins with the BOM: Do something.
    # For example, decode the string as UTF-8
    u = unicode( s, "utf8" )

if u[0] == unicode( codecs.BOM_UTF8, "utf8" ):
    # The unicode string begins with the BOM: Do something.
    # For example, remove the character.
    u = u[1:]

# Strip the BOM from the beginning of the Unicode string, if it exists
# (lstrip returns a new string, so keep the result)
u = u.lstrip( unicode( codecs.BOM_UTF8, "utf8" ) )
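As a self-contained illustration of the detect-and-remove approach, here is a sketch for Python 3 (an assumption, since the article was tested with Python 2.3). The helper name `strip_bom` is hypothetical. The sketch also uses the `utf-8-sig` codec, which is not mentioned in the article; it was added in later Python versions and drops a leading BOM while decoding.

```python
import codecs


def strip_bom(data):
    """Decode UTF-8 bytes, dropping a leading BOM if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


with_bom = codecs.BOM_UTF8 + b"hello"

# Plain UTF-8 decoding keeps the BOM as the character U+FEFF.
plain = with_bom.decode("utf-8")

# Manual removal, as described above.
manual = strip_bom(with_bom)

# The "utf-8-sig" codec strips the BOM during decoding.
via_codec = with_bom.decode("utf-8-sig")
```

Both approaches leave input without a BOM unchanged, so they are safe to apply to every file.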

Writing Python Scripts in Unicode

As you may have noticed from the examples on this page, you can actually write Python scripts in UTF-8. Variable names must be in ASCII, but you can include Chinese comments, or Korean strings in your source files. In order for this to work correctly, Python needs to know that your script file is not ASCII. You can do this in one of two ways. First, you can place a UTF-8 byte-order marker at the beginning of your file, if your editor supports it. Second, you can place the following special comment on the first or second line of your script:

# -*- coding: utf-8 -*-

Any ASCII-compatible encoding is permitted. For details, see the Defining Python Source Code Encodings specification.


July 20, 2010 at 12:06 am

Easy install


C:\Program Files\Dev\Plone 3\Python\Scripts>python easy_install-2.4-script.py imsvdex


July 20, 2010 at 12:06 am

Plone: Customizing a layer


July 20, 2010 at 12:06 am

Palm media players

Back in the day… this is how I fared with the media players for PalmOS
!DioPlayer
* works NOT w/phone (must sometimes reset device on incoming call)
* no manual screen off (just timeout)
* cool playlist in-site drag n drop mgmt
* eq hogs cpu down

!Busker
* odd navigation; seems cool, but gets ugly with many artists
* works w/phone (must push play again)
* once .. from /audio cannot go back
* shows covers (even may grab them from the net)

!Mmplayer
* best navigation
* does video too
* BEST tech oriented software
* in-place playlist mgmt (button up/dn)
* eq
* preamp
* cannot manually turn screen off

!Real
* works w/phone (must push play again, STOPS, does not pause)
* flat song list (song & playlist mode)… Useless for long lists
* can switch screen off

!Ptunes 3
* fine bass boost
* works w/phone
* can switch screen off
* eq
* fixed pre amp (3 levels) does the trick

!Ptunes 4
* MPC sync means no extra software
* nice ipod-esque navigation w/ MPC, as well as decent filesystem nav (/audio)
* can switch screen off

!TCPMP
* best compatibility (image, video, audio:
* correct file system mgmt, w/checkboxes to select even dirs
* cannot order playlist
* NO BACKGROUND MODE
* does .m3u
* does covers
* can switch screen off
* eq lots of bands
* good preamp

July 20, 2010 at 12:06 am

Plone: change home

http://n2.nabble.com/Changing-the-default-plone-Home-page-td358151.html

If you want the news instead of the default front page, you can go to ZMI > your site > Properties tab > default page.

You can use an actual Plone object or a browser .pt page.


July 20, 2010 at 12:06 am

Miscellaneous OWB

Mapping editor bugs
Sometimes the mapping gets corrupted with

"VLD-1141: Internal error during mapping generation."

It happened to me with a complex mapping (the GDF fact table, staging).

According to
http://forums.oracle.com/forums/thread.jspa?threadID=449871&tstart=0

it is a bug, with Bug id 5689223.

TROUBLE WITH SYNCHRONIZE
1. I drop and recreate a dimension along with its table
2. I open the mapping and tell it to sync > outbound
3. re-deploy the table: it says it already exists. Oh well… re-create
4. same thing with the dimension
5. I try to run the mapping and get an error
6. it turns out that a column name that had changed in the original table was not synchronized when I did the sync…

July 20, 2010 at 12:03 am

OBIEE Users and Groups

1. Create the user and the group with OBI Administration Tool. NOTE: "test" was rejected, "test_usr" worked. (??? Maybe because I assigned neither a group nor a filter to the user)

2. Restart OBI Server

3. Create the group in OBI Presentation Services (web)

4. Log in with the new user

5. Log out

6. Log in as admin

7. Now I can see the user; I add it to the group (web)

I can set filters on it in OBI admin tool and they work!

July 20, 2010 at 12:02 am

OBIEE WEB_HOME

The static web content files are in

C:\OracleBI\oc4j_bi\j2ee\home\applications\analytics\analytics

July 20, 2010 at 12:01 am

Miscellaneous Informix

i.Reach installation (NT)
1. Create the database with dbAccess, giving it buffered logging. Create it as INFORMIX:INFORMIX, not as Administrator. Watch the free space in the dbspace.
CREATE DATABASE ireach IN ol_ireach WITH LOG;

2. Create the sbspace:
onspaces -c -S sbireach -p c:\informix\dbspaces\sbireach -o 0 -s 100000 (size in KB) -Df "LOGGING=ON"

3. Register the blades in the database (watch the logs):
· etx
· ifxbuiltins
· lld
· TXT
· web 3.32TC4

4. Register the apb schema: run c:\informix\extend\web.3.32.TC4\apb\schema.bat ireach sbireach

5. Install drvisapi.dll (or the driver that matches the web server).

NOTE: If it must coexist with a WDB 4 installation using ISAPI or NSAPI, it is best to install WDB 3 as CGI, to avoid conflicts with the web.cnf format for multiple configurations.

6. Create the stored procedure isvaliddatetime (read the note for winNT)

7. Install iReach (watch the log)

8. Build web.cnf from the example in the installation directory

IFX: Logs etc.
When stuck:
onstat -l
if no log has the 'b' flag and all of them are at 100%:
ontape -a
IDS

select procname, created
from sysprocedures A, sysprocplan B
where A.procid = B.procid;

IFX: unload to <path> delimiter ':'
select…
IFX: onspaces -c -s size -p path -Df "logging=on",
blademgr
list dbname
reg web… dname
ifx: logs
Nasis.nrcs.usda.gov/archive/logbuf.html

July 20, 2010 at 12:00 am

.BAT to start Plone in Debug

https://svn.restauranteando.com/bin/s.bat

Name it "s.bat" so that answering "S" to the "terminate batch job" prompt leaves the S in the command-line history

@set PYTHON=C:\Program Files\Dev\Plone 3\Python\python.exe
@set ZOPE_HOME=C:\Program Files\Dev\Plone 3\Zope
@set INSTANCE_HOME=C:\Users\ALeX\PlonesR10
@set SOFTWARE_HOME=C:\Program Files\Dev\Plone 3\Zope\lib\python
@set CONFIG_FILE=C:\Users\ALeX\PlonesR10\etc\zope.conf
@set PYTHONPATH=%INSTANCE_HOME%\lib\python;%SOFTWARE_HOME%;%PYTHONPATH%
@set ZOPE_RUN=C:\Program Files\Dev\Plone 3\Zope\lib\python\Zope2\Startup\run.py
"%PYTHON%" "%ZOPE_RUN%" -C "%CONFIG_FILE%" -X "debug-mode=on"


July 19, 2010 at 11:58 pm
