I spent a long time looking for this, and one look makes it all clear:
jv-convert --from gb2312 --to UTF8 -i gb2312.txt -o UTF8.txt
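
Since the man page below describes jv-convert as similar to iconv, the same conversion can probably be done with iconv where jv-convert is not installed. A minimal sketch, assuming an iconv that knows the GB2312 and UTF-8 names; the sample file content is made up for illustration:

```shell
# Build a small sample file: bytes D6 D0 CE C4 are "中文" in GB2312
# (octal escapes are used because POSIX printf has no \x escapes).
printf '\326\320\316\304\n' > gb2312.txt

# Rough iconv counterpart of:
#   jv-convert --from gb2312 --to UTF8 -i gb2312.txt -o UTF8.txt
iconv -f GB2312 -t UTF-8 gb2312.txt > UTF8.txt

cat UTF8.txt   # prints: 中文
```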
1] Change &#XXXXX; to UTF-8:
perl -p -e 's/&#(.....);/pack("U", $1)/eg'
2] Change \xb?\xb? to GB2312
perl -p -e 's/\\x([0-9A-Fa-f]{2})/pack("C", hex($1))/eg'
3] Change %C2%D2%C2%D7 to GB2312
perl -p -e 's/%([0-9A-Fa-f]{2})/pack("C", hex($1))/eg'
JV-CONVERT(1) GNU JV-CONVERT(1)
NAME
jv-convert - Convert file from one encoding to another
SYNOPSIS
jv-convert [OPTION] ... [INPUTFILE [OUTPUTFILE]]
DESCRIPTION
jv-convert is a utility included with "libgcj" which converts a file
from one encoding to another. It is similar to the Unix iconv utility.
The encodings supported by jv-convert are platform-dependent. Currently
there is no way to get a list of all supported encodings.
OPTIONS
--encoding name
--from name
Use name as the input encoding. The default is the current
locale's encoding.
--to name
Use name as the output encoding. The default is the "JavaSrc"
encoding; this is ASCII with \u escapes for non-ASCII characters.
-i file
Read from file. The default is to read from standard input.
-o file
Write to file. The default is to write to standard output.
--reverse
Swap the input and output encodings.
--help
Print a help message, then exit.
--version
Print version information, then exit.
SEE ALSO
COPYRIGHT
Copyright (c) 2001, 2002 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ‘‘