GNU/Linux, CentOS 5.3
nkf(1)
nkf − Network Kanji Filter
nkf [-butjnesliohrTVvwWJESZxXFfmMBOcdILg] [file ...]
Nkf is yet another kanji code converter for use among networks, hosts and terminals. It converts the input kanji code to a designated kanji code such as ISO−2022−JP, Shift_JIS, EUC−JP, UTF−8 or UTF−16.
One of nkf’s most distinctive features is that it guesses the input kanji encoding. It currently recognizes ISO−2022−JP, Shift_JIS, EUC−JP, UTF−8 and UTF−16, so users need not set the input kanji code explicitly.
By default, X0201 kana is converted into X0208 kana. For X0201 kana, the SO/SI, SSO and ESC−(−I methods are supported. For automatic code detection, nkf assumes no X0201 kana in Shift_JIS. To accept X0201 in Shift_JIS, use −X, −x or −S.
−b −u
Output is buffered (DEFAULT); output is unbuffered.
−j −s −e −w −w16
Output code is ISO−2022−JP (7bit JIS), Shift_JIS, EUC−JP, UTF−8N, UTF−16BE. If neither this option nor a compile option is given, ISO−2022−JP is assumed.
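To make the difference between these output codes concrete, the following minimal sketch encodes the same two-character string with Python's standard codecs. The codecs only stand in for nkf's own converters, and the pairing of option letters with codec names is an assumption of this sketch, not something nkf defines.

```python
# Illustration only: Python's codecs stand in for nkf's converters.
TEXT = "漢字"

# Option letter -> Python codec name (the pairing is an assumption of this sketch).
CODECS = {
    "-j": "iso2022_jp",    # ISO-2022-JP (7bit JIS)
    "-s": "shift_jis",     # Shift_JIS
    "-e": "euc_jp",        # EUC-JP
    "-w": "utf_8",         # UTF-8 without BOM
    "-w16": "utf_16_be",   # UTF-16 big endian without BOM
}


def encode_all(text):
    """Return {option: encoded bytes} for each output code."""
    return {opt: text.encode(codec) for opt, codec in CODECS.items()}


if __name__ == "__main__":
    for opt, data in encode_all(TEXT).items():
        print(f"{opt:5} {data.hex(' ')}")
```

Note that the ISO−2022−JP result is pure 7-bit data framed by escape sequences, while the other codes use bytes with the high bit set.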
−J −S −E −W −W16
Input assumption is JIS 7 bit, Shift_JIS, EUC−JP, UTF−8, UTF−16LE.
−J
Assume JIS input. It also accepts EUC−JP. This is the default. This flag does not exclude Shift_JIS.
−S
Assume Shift_JIS and X0201 kana input. It also accepts JIS. EUC−JP is recognized as X0201 kana. Without the −x flag, X0201 kana (halfwidth kana) is converted into X0208.
−E
Assume EUC−JP input. It also accepts JIS. Same as −J.
−t
No conversion.
−i[@B]
Specify the escape sequence for JIS X 0208−1978/83. (DEFAULT B)
−o[BJH]
Specify the escape sequence for ASCII/Roman. (DEFAULT B)
−r
Encrypt or decrypt with ROT13/47.
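As a rough illustration of the two rotations named by −r, the sketch below implements ROT13 on ASCII letters and ROT47 on the printable 7-bit range 0x21 to 0x7E. How nkf decides which rotation applies to which input is not stated here, so the split into two separate functions is an assumption of this sketch.

```python
def rot13(text):
    """Rotate ASCII letters by 13 places; leave everything else alone."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def rot47(data):
    """Rotate bytes in 0x21..0x7E by 47 places within that 94-byte range."""
    return bytes(
        (b - 0x21 + 47) % 94 + 0x21 if 0x21 <= b <= 0x7E else b
        for b in data
    )


if __name__ == "__main__":
    print(rot13("Hello, nkf"))
    print(rot47(b"4A;z"))
```

Both rotations are their own inverse, which is why a single flag can serve for encrypting and decrypting.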
−h[123] −−hiragana −−katakana −−katakana−hiragana
−h1 −−hiragana
Katakana to Hiragana conversion.
−h2 −−katakana
Hiragana to Katakana conversion.
−h3 −−katakana−hiragana
Katakana to Hiragana and Hiragana to Katakana conversion.
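The three −h modes can be pictured in Unicode terms, where the hiragana and katakana blocks sit exactly 0x60 code points apart. The sketch below is an illustration of that idea only; nkf works on JIS codes internally, and the function names here are hypothetical.

```python
HIRAGANA = range(0x3041, 0x3097)  # ぁ .. ゖ
KATAKANA = range(0x30A1, 0x30F7)  # ァ .. ヶ
OFFSET = 0x60                     # distance between the two blocks


def to_hiragana(text):
    """Katakana to hiragana (compare -h1)."""
    return "".join(
        chr(ord(c) - OFFSET) if ord(c) in KATAKANA else c for c in text
    )


def to_katakana(text):
    """Hiragana to katakana (compare -h2)."""
    return "".join(
        chr(ord(c) + OFFSET) if ord(c) in HIRAGANA else c for c in text
    )


def swap_kana(text):
    """Swap both directions in a single pass (compare -h3)."""
    out = []
    for c in text:
        cp = ord(c)
        if cp in KATAKANA:
            out.append(chr(cp - OFFSET))
        elif cp in HIRAGANA:
            out.append(chr(cp + OFFSET))
        else:
            out.append(c)
    return "".join(out)


if __name__ == "__main__":
    print(to_hiragana("カタカナ"), to_katakana("ひらがな"), swap_kana("ひらカタ"))
```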
−T
Text mode output (MS−DOS).
−l
ISO8859−1 (Latin−1) support.
−f[m[−n]]
Fold lines at length m with a margin of n. When m and n are omitted, the fold length is 60 and the fold margin is 10.
−F
Line folding that preserves existing newlines.
−Z[0−3]
Convert X0208 alphabet (fullwidth alphabets) to ASCII.
−Z −Z0
Convert X0208 alphabet to ASCII.
−Z1
Convert X0208 kankaku (the fullwidth space) to a single ASCII space.
−Z2
Convert X0208 kankaku (the fullwidth space) to two ASCII spaces.
−Z3
Replace fullwidth >, <, ", & with ’&gt;’, ’&lt;’, ’&quot;’, ’&amp;’ as in HTML.
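The −Z levels can be sketched in Unicode terms, where the fullwidth forms U+FF01 to U+FF5E lie a fixed 0xFEE0 above their ASCII counterparts. This is a minimal sketch under assumptions: the exact set of characters nkf converts, and whether −Z3 also changes the fullwidth space, are not stated here, and `z_convert` is a hypothetical name.

```python
HTML_ENTITIES = {">": "&gt;", "<": "&lt;", '"': "&quot;", "&": "&amp;"}
FULLWIDTH_SPACE = "\u3000"


def z_convert(text, level=0):
    """Map fullwidth ASCII forms to ASCII; level mimics -Z0 .. -Z3."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == FULLWIDTH_SPACE and level in (1, 2):
            # Level 1 gives one ASCII space, level 2 gives two.
            out.append(" " * level)
        elif 0xFF01 <= cp <= 0xFF5E:
            ascii_ch = chr(cp - 0xFEE0)
            if level == 3 and ascii_ch in HTML_ENTITIES:
                out.append(HTML_ENTITIES[ascii_ch])
            else:
                out.append(ascii_ch)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(z_convert("ＡＢＣ１２３"))
    print(z_convert("＜ａ＞", 3))
```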
−X −x
Assume X0201 kana in MS−Kanji. With −X or without this option, X0201 is converted into X0208 kana. With −x, try to preserve X0208 kana and do not convert X0201 kana to X0208. In JIS output, ESC−(−I is used. In EUC output, SSO is used.
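The default X0201 to X0208 conversion corresponds, in Unicode terms, to turning halfwidth katakana into fullwidth katakana, including merging a kana with a following voiced or semi-voiced sound mark. The sketch below approximates this with NFKC normalization restricted to the halfwidth kana range U+FF61 to U+FF9F. It is an analogy, not nkf's conversion table, and `x0201_to_x0208` is a hypothetical name.

```python
import re
import unicodedata

# Runs of halfwidth katakana and halfwidth CJK punctuation.
_HALFWIDTH_RUN = re.compile("[\uFF61-\uFF9F]+")


def x0201_to_x0208(text):
    """Widen halfwidth kana; other characters are left untouched."""
    return _HALFWIDTH_RUN.sub(
        lambda m: unicodedata.normalize("NFKC", m.group()), text
    )


if __name__ == "__main__":
    print(x0201_to_x0208("ｶﾞｷﾞ ABC"))
```

Normalizing only the matched runs keeps fullwidth alphabets and other compatibility characters as they are, which plain NFKC over the whole string would not.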
−B[0−2]
Assume broken JIS kanji input that has lost its ESC characters. Useful when your site is using the old B−News Nihongo patch.
−B1
Allows any character after ESC−( or ESC−$.
−B2
Forces ASCII after NL.
−I
Replace non-ISO−2022−JP characters with a geta character (the substitute character in Japanese).
−m[BQN0]
MIME ISO−2022−JP/ISO8859−1 decode. (DEFAULT) To see ISO8859−1 (Latin−1), −l is necessary.
−mB
Decode MIME base64 encoded stream. Remove header or other part before conversion.
−mQ
Decode MIME quoted stream. ’_’ in quoted stream is converted to space.
−mN
Non-strict decoding. It allows line break in the middle of the base64 encoding.
−m0
No MIME decode.
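To show what header-style MIME decoding involves, the sketch below decodes one base64 encoded word and one quoted encoded word using Python's standard `email.header` module. It illustrates the encoded-word format only, not nkf's decoder; `decode_mime_words` is a hypothetical helper.

```python
from email.header import decode_header


def decode_mime_words(header):
    """Decode MIME encoded words in a header value to text."""
    parts = []
    for data, charset in decode_header(header):
        if isinstance(data, bytes):
            # Unlabelled byte chunks are treated as ASCII in this sketch.
            parts.append(data.decode(charset or "ascii"))
        else:
            parts.append(data)
    return "".join(parts)


if __name__ == "__main__":
    # Base64 ("B") encoded word carrying ISO-2022-JP text.
    print(decode_mime_words("=?ISO-2022-JP?B?GyRCNEE7ehsoQg==?="))
    # Quoted ("Q") encoded word: "_" stands for a space.
    print(decode_mime_words("=?ISO-8859-1?Q?caf=E9_au_lait?="))
```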
−M
MIME encode, header style. All ASCII codes and control characters are left intact.
−MB
MIME encode Base64 stream. Kanji conversion is performed before encoding, so this cannot be used as a picture encoder.
−MQ
Perform quoted encoding.
−l
Input and output codes are ISO8859−1 (Latin−1) and ISO−2022−JP. −s, −e and −x are not compatible with this option.
−L[uwm] −d −c
Convert line breaks.
−Lu −d
unix (LF)
−Lw −c
windows (CRLF)
−Lm
mac (CR)
Without this option, nkf doesn’t convert line breaks.
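The three line-break targets can be sketched as a single substitution over the byte stream. This is a minimal illustration, assuming that any of CRLF, CR or LF in the input counts as one line break; the mode letters mirror −Lu, −Lw and −Lm, and `convert_line_breaks` is a hypothetical name.

```python
import re

# CRLF must be tried first so it is not seen as CR followed by LF.
_NEWLINE = re.compile(rb"\r\n|\r|\n")

EOL = {
    "u": b"\n",    # unix (LF)
    "w": b"\r\n",  # windows (CRLF)
    "m": b"\r",    # mac (CR)
}


def convert_line_breaks(data, mode):
    """Rewrite every line break in data to the style selected by mode."""
    eol = EOL[mode]
    return _NEWLINE.sub(lambda _match: eol, data)


if __name__ == "__main__":
    print(convert_line_breaks(b"a\r\nb\rc\n", "u"))
```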
−−fj −−unix −−mac −−msdos −−windows
Convert for these systems.
−−jis −−euc −−sjis −−mime −−base64
Convert to the named code.
−−jis−input −−euc−input −−sjis−input −−mime−input −−base64−input
Assume input in the named code.
−−ic=input codeset −−oc=output codeset
Set the input or output codeset. NKF supports the following codesets; codeset names are case insensitive.
ISO−2022−JP
a.k.a. RFC1468, 7bit JIS, JUNET
EUC-JP (eucJP−nkf)
a.k.a. AT&T JIS, Japanese EUC, UJIS
eucJP-ascii
eucJP-ms
CP51932
Microsoft version of EUC−JP.
Shift_JIS
a.k.a. SJIS, MS-Kanji
CP932
a.k.a. Windows−31J
UTF−8
same as UTF−8N
UTF−8N
UTF−8 without BOM
UTF−8−BOM
UTF−8 with BOM
UTF−16
same as UTF−16BE
UTF−16BE
UTF−16 Big Endian without BOM
UTF−16BE−BOM
UTF−16 Big Endian with BOM
UTF−16LE
UTF−16 Little Endian without BOM
UTF−16LE−BOM
UTF−16 Little Endian with BOM
UTF8−MAC (input only)
−−fb−{skip, html, xml, perl, java, subchar}
Specify the way that nkf handles unassigned characters. Without this option, −−fb−skip is assumed.
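The idea of a fallback for characters that the output code cannot represent has a loose analogue in the error handlers of Python's codecs. The sketch below pairs three of the fallback names with Python handlers; this pairing is an assumption made for illustration, and the exact output nkf produces for each mode is not specified here.

```python
# Fallback name -> Python error handler (an assumed, approximate pairing).
HANDLERS = {
    "skip": "ignore",             # drop the character
    "html": "xmlcharrefreplace",  # numeric character reference
    "subchar": "replace",         # substitute character
}


def encode_with_fallback(text, codec, fallback="skip"):
    """Encode text, handling unencodable characters as selected."""
    return text.encode(codec, HANDLERS[fallback])


if __name__ == "__main__":
    sample = "A\U0001F600B"  # the middle character is not in Shift_JIS
    for name in HANDLERS:
        print(name, encode_with_fallback(sample, "shift_jis", name))
```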
−−prefix=<escape character><target character>..
When nkf converts to Shift_JIS, it adds the specified escape character to the specified second bytes of Shift_JIS characters. The first byte of the argument is the escape character and the following bytes are the target characters.
−−no−cp932ext
Handle the characters extended in CP932 as unassigned characters.
−−no−best−fit−chars
When converting Unicode to encoded bytes, do not convert characters that are not round-trip safe. When converting Unicode to Unicode, with this and the −x option, nkf can be used as a UTF converter. (In other words, without this and the −x option, nkf does not preserve some characters.)
When nkf converts strings related to paths, you should use this option.
−−cap−input
Decode hex encoded characters.
−−url−input
Unescape percent escaped characters.
−−numchar−input
Decode character reference, such as "&#....;".
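For the percent-escaped and character-reference input forms, the following sketch shows the kind of decoding involved, using Python's standard library. It assumes the escaped bytes are UTF-8 and handles only numeric references of the form "&#NNNN;" and "&#xHHHH;". Both function names are hypothetical, and nkf's own handling may differ in detail.

```python
import re
from urllib.parse import unquote_to_bytes

_NUMREF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def url_input(text, encoding="utf-8"):
    """Unescape %XX sequences, then decode the resulting bytes."""
    return unquote_to_bytes(text).decode(encoding)


def numchar_input(text):
    """Replace decimal and hexadecimal character references."""
    def repl(match):
        if match.group(1):
            return chr(int(match.group(1), 16))
        return chr(int(match.group(2)))
    return _NUMREF.sub(repl, text)


if __name__ == "__main__":
    print(url_input("%E6%BC%A2%E5%AD%97"))
    print(numchar_input("&#28450;&#x5B57;"))
```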
−−in−place[=SUFFIX] −−overwrite[=SUFFIX]
Overwrite the original listed files with the filtered result.
Note that −−overwrite preserves the timestamps of the original files.
−−guess
Print guessed encoding.
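One simple way to picture encoding guessing is trial decoding: attempt each candidate in a fixed order and report the first that decodes cleanly. The sketch below does this with Python's codecs. It is not nkf's detection algorithm, and both the candidate order and the returned labels are this sketch's own choices.

```python
# Tried in order; many EUC-JP byte strings also decode as Shift_JIS,
# so the order of the candidates affects the answer.
CANDIDATES = [
    ("UTF-8", "utf_8"),
    ("EUC-JP", "euc_jp"),
    ("Shift_JIS", "shift_jis"),
]


def guess_encoding(data):
    """Return a label for the first encoding that decodes data cleanly."""
    if b"\x1b" in data:
        # Escape sequences suggest a 7-bit ISO-2022 style encoding.
        try:
            data.decode("iso2022_jp")
            return "ISO-2022-JP"
        except UnicodeDecodeError:
            pass
    if all(b < 0x80 for b in data):
        return "ASCII"
    for label, codec in CANDIDATES:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue
        return label
    return "BINARY"


if __name__ == "__main__":
    for codec in ("iso2022_jp", "euc_jp", "shift_jis", "utf_8"):
        print(codec, "->", guess_encoding("漢字".encode(codec)))
```

Short inputs are inherently ambiguous under any such scheme, which is why an explicit input option remains useful.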
−−help
Print nkf’s help.
−−version
Print nkf’s version.
−−
Ignore the rest of the −options.
Copyright (C) 1987, FUJITSU LTD. (I.Ichikawa), 2000 S. Kono, COW
Copyright (C) 2002−2006 Kono, Furukawa, Naruse, mastodon