[Linux] A handy tool for encoding conversion [repost]

Posted: 2007-06-09

I had been looking for this for a long time, and it is clear at a glance:

jv-convert --from gb2312 --to UTF8 -i gb2312.txt -o UTF8.txt
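jv-convert ships with libgcj, which was removed from GCC in version 7, so it may be missing on newer systems. Its man page (quoted further down) describes it as similar to iconv. A minimal sketch of the same conversion with iconv follows. It assumes iconv is installed and reuses the file names from the command above; the sample file content is made up for the demonstration.

```shell
# Create a small GB2312 sample file: bytes D6 D0 (octal 326 320) encode the character U+4E2D.
printf '\326\320\n' > gb2312.txt

# Same conversion as the jv-convert command above, done with iconv.
iconv -f GB2312 -t UTF-8 gb2312.txt > UTF8.txt

# Inspect the result as hex bytes; this should show e4 b8 ad 0a,
# the UTF-8 encoding of U+4E2D followed by a newline.
od -An -tx1 UTF8.txt
```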


1] Change '&#XXXXX;' (decimal character references) to UTF-8:
perl -p -e 's/&#(\d+);/pack("U", $1)/eg'

2] Change literal \xb?\xb? escapes to raw GB2312 bytes:
perl -p -e 's/\\x(..)/pack("C", hex($1))/eg'

3] Change %C2%D2%C2%D7 (percent-encoded bytes) to raw GB2312 bytes:
perl -p -e 's/%(..)/pack("C", hex($1))/eg'

JV-CONVERT(1)                         GNU                        JV-CONVERT(1)

NAME
       jv-convert - Convert file from one encoding to another

SYNOPSIS
       jv-convert [OPTION] ... [INPUTFILE [OUTPUTFILE]]

DESCRIPTION
       jv-convert is a utility included with "libgcj" which converts a file
       from one encoding to another.  It is similar to the Unix iconv utility.

       The encodings supported by jv-convert are platform-dependent.
       Currently there is no way to get a list of all supported encodings.

OPTIONS
       --encoding name
       --from name
           Use name as the input encoding.  The default is the current
           locale's encoding.

       --to name
           Use name as the output encoding.  The default is the "JavaSrc"
           encoding; this is ASCII with \u escapes for non-ASCII characters.

       -i file
           Read from file.  The default is to read from standard input.

       -o file
           Write to file.  The default is to write to standard output.

       --reverse
           Swap the input and output encodings.

       --help
           Print a help message, then exit.

       --version
           Print version information, then exit.

SEE ALSO
COPYRIGHT
       Copyright (c) 2001, 2002 Free Software Foundation, Inc.

       Permission is granted to copy, distribute and/or modify this document
       under the terms of the GNU Free Documentation License, Version 1.2 or
       any later version published by the Free Software Foundation; with the
       Invariant Sections being ...

Original reposted from: http://www.ltesting.net