/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.catalina.util;

import java.util.*;

MIME2Java is a convenience class which handles conversions between MIME charset names and Java encoding names.

The supported XML encodings are the intersection of XML-supported code sets and those supported in JDK 1.1.

MIME charset names are used on xmlEncoding parameters to methods such as TXDocument#setEncoding and DTD#setEncoding.

Java encoding names are used on encoding parameters to methods such as TXDocument#printWithFormat and DTD#printExternal.

Common Name                              Use this name in XML files  Name Type  Xerces converts to this Java Encoder Name
---------------------------------------  --------------------------  ---------  -----------------------------------------
8 bit Unicode                            UTF-8                       IANA       UTF8
ISO Latin 1                              ISO-8859-1                  MIME       ISO-8859-1
ISO Latin 2                              ISO-8859-2                  MIME       ISO-8859-2
ISO Latin 3                              ISO-8859-3                  MIME       ISO-8859-3
ISO Latin 4                              ISO-8859-4                  MIME       ISO-8859-4
ISO Latin Cyrillic                       ISO-8859-5                  MIME       ISO-8859-5
ISO Latin Arabic                         ISO-8859-6                  MIME       ISO-8859-6
ISO Latin Greek                          ISO-8859-7                  MIME       ISO-8859-7
ISO Latin Hebrew                         ISO-8859-8                  MIME       ISO-8859-8
ISO Latin 5                              ISO-8859-9                  MIME       ISO-8859-9
EBCDIC: US                               ebcdic-cp-us                IANA       cp037
EBCDIC: Canada                           ebcdic-cp-ca                IANA       cp037
EBCDIC: Netherlands                      ebcdic-cp-nl                IANA       cp037
EBCDIC: Denmark                          ebcdic-cp-dk                IANA       cp277
EBCDIC: Norway                           ebcdic-cp-no                IANA       cp277
EBCDIC: Finland                          ebcdic-cp-fi                IANA       cp278
EBCDIC: Sweden                           ebcdic-cp-se                IANA       cp278
EBCDIC: Italy                            ebcdic-cp-it                IANA       cp280
EBCDIC: Spain, Latin America             ebcdic-cp-es                IANA       cp284
EBCDIC: Great Britain                    ebcdic-cp-gb                IANA       cp285
EBCDIC: France                           ebcdic-cp-fr                IANA       cp297
EBCDIC: Arabic                           ebcdic-cp-ar1               IANA       cp420
EBCDIC: Hebrew                           ebcdic-cp-he                IANA       cp424
EBCDIC: Switzerland                      ebcdic-cp-ch                IANA       cp500
EBCDIC: Roece                            ebcdic-cp-roece             IANA       cp870
EBCDIC: Yugoslavia                       ebcdic-cp-yu                IANA       cp870
EBCDIC: Iceland                          ebcdic-cp-is                IANA       cp871
EBCDIC: Urdu                             ebcdic-cp-ar2               IANA       cp918
Chinese for PRC, mixed 1/2 byte          gb2312                      MIME       GB2312
Extended Unix Code, packed for Japanese  euc-jp                      MIME       eucjis
Japanese: iso-2022-jp                    iso-2022-jp                 MIME       JIS
Japanese: Shift JIS                      Shift_JIS                   MIME       SJIS
Chinese: Big5                            Big5                        MIME       Big5
Extended Unix Code, packed for Korean    euc-kr                      MIME       iso2022kr
Cyrillic                                 koi8-r                      MIME       koi8-r
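
The lookup described above can be sketched in a few lines. This is only an illustration, not the class's actual code: `CharsetNameSketch`, `toJavaName` and `decode` are hypothetical names, and the map holds just three of the table's entries. It assumes the caller receives a MIME charset name (for example from an XML declaration) and needs the Java encoding name to decode bytes.

```java
import java.io.UnsupportedEncodingException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CharsetNameSketch {

    // Hypothetical three-entry subset: upper-cased MIME name -> Java encoding name.
    private static final Map<String, String> MIME_TO_JAVA = new HashMap<>();
    static {
        MIME_TO_JAVA.put("UTF-8", "UTF8");
        MIME_TO_JAVA.put("SHIFT_JIS", "SJIS");
        MIME_TO_JAVA.put("EUC-JP", "EUCJIS");
    }

    // Case-insensitive lookup; returns null for names outside the subset.
    static String toJavaName(String mimeName) {
        return MIME_TO_JAVA.get(mimeName.toUpperCase(Locale.ENGLISH));
    }

    // Decode bytes whose charset is identified by a MIME name.
    static String decode(byte[] bytes, String mimeName) throws UnsupportedEncodingException {
        String javaName = toJavaName(mimeName);
        if (javaName == null) {
            throw new UnsupportedEncodingException(mimeName);
        }
        return new String(bytes, javaName);
    }

    public static void main(String[] args) throws Exception {
        byte[] data = {(byte) 0xC3, (byte) 0xA9}; // one accented character encoded as UTF-8
        System.out.println(toJavaName("utf-8"));
        System.out.println(decode(data, "utf-8").length());
        System.out.println(toJavaName("x-unknown"));
    }
}
```

Upper-casing with a fixed locale keeps the lookup independent of the platform's default locale, which matters for names containing the letter "i".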

Author(s):
TAMURA Kent <kent@trl.ibm.co.jp>
Version:
$Revision: 1473 $ $Date: 2010-05-17 19:46:58 +0200 (Mon, 17 May 2010) $
public class MIME2Java {
    static private Hashtable s_enchash;
    static private Hashtable s_revhash;
    static {
        s_enchash = new Hashtable();
        //    <preferred MIME name>, <Java encoding name>
        s_enchash.put("UTF-8",           "UTF8");
        s_enchash.put("US-ASCII",        "8859_1");    // ?
        s_enchash.put("ISO-8859-1",      "8859_1");
        s_enchash.put("ISO-8859-2",      "8859_2");
        s_enchash.put("ISO-8859-3",      "8859_3");
        s_enchash.put("ISO-8859-4",      "8859_4");
        s_enchash.put("ISO-8859-5",      "8859_5");
        s_enchash.put("ISO-8859-6",      "8859_6");
        s_enchash.put("ISO-8859-7",      "8859_7");
        s_enchash.put("ISO-8859-8",      "8859_8");
        s_enchash.put("ISO-8859-9",      "8859_9");
        s_enchash.put("ISO-2022-JP",     "JIS");
        s_enchash.put("SHIFT_JIS",       "SJIS");
        s_enchash.put("EUC-JP",          "EUCJIS");
        s_enchash.put("GB2312",          "GB2312");
        s_enchash.put("BIG5",            "Big5");
        s_enchash.put("EUC-KR",          "KSC5601");
        s_enchash.put("ISO-2022-KR",     "ISO2022KR");
        s_enchash.put("KOI8-R",          "KOI8_R");
        s_enchash.put("EBCDIC-CP-US",    "CP037");
        s_enchash.put("EBCDIC-CP-CA",    "CP037");
        s_enchash.put("EBCDIC-CP-NL",    "CP037");
        s_enchash.put("EBCDIC-CP-DK",    "CP277");
        s_enchash.put("EBCDIC-CP-NO",    "CP277");
        s_enchash.put("EBCDIC-CP-FI",    "CP278");
        s_enchash.put("EBCDIC-CP-SE",    "CP278");
        s_enchash.put("EBCDIC-CP-IT",    "CP280");
        s_enchash.put("EBCDIC-CP-ES",    "CP284");
        s_enchash.put("EBCDIC-CP-GB",    "CP285");
        s_enchash.put("EBCDIC-CP-FR",    "CP297");
        s_enchash.put("EBCDIC-CP-AR1",   "CP420");
        s_enchash.put("EBCDIC-CP-HE",    "CP424");
        s_enchash.put("EBCDIC-CP-CH",    "CP500");
        s_enchash.put("EBCDIC-CP-ROECE", "CP870");
        s_enchash.put("EBCDIC-CP-YU",    "CP870");
        s_enchash.put("EBCDIC-CP-IS",    "CP871");
        s_enchash.put("EBCDIC-CP-AR2",   "CP918");
                                                // j:CNS11643 -> EUC-TW?
                                                // ISO-2022-CN? ISO-2022-CN-EXT?
        s_revhash = new Hashtable();
        //    <Java encoding name>, <preferred MIME name>
        // Several MIME names share one Java name; the last put for a key wins.
        s_revhash.put("UTF8",      "UTF-8");
        //s_revhash.put("8859_1", "US-ASCII");    // ?
        s_revhash.put("8859_1",    "ISO-8859-1");
        s_revhash.put("8859_2",    "ISO-8859-2");
        s_revhash.put("8859_3",    "ISO-8859-3");
        s_revhash.put("8859_4",    "ISO-8859-4");
        s_revhash.put("8859_5",    "ISO-8859-5");
        s_revhash.put("8859_6",    "ISO-8859-6");
        s_revhash.put("8859_7",    "ISO-8859-7");
        s_revhash.put("8859_8",    "ISO-8859-8");
        s_revhash.put("8859_9",    "ISO-8859-9");
        s_revhash.put("JIS",       "ISO-2022-JP");
        s_revhash.put("SJIS",      "Shift_JIS");
        s_revhash.put("EUCJIS",    "EUC-JP");
        s_revhash.put("GB2312",    "GB2312");
        s_revhash.put("BIG5",      "Big5");
        s_revhash.put("KSC5601",   "EUC-KR");
        s_revhash.put("ISO2022KR", "ISO-2022-KR");
        s_revhash.put("KOI8_R",    "KOI8-R");
        s_revhash.put("CP037",     "EBCDIC-CP-US");
        s_revhash.put("CP037",     "EBCDIC-CP-CA");
        s_revhash.put("CP037",     "EBCDIC-CP-NL");
        s_revhash.put("CP277",     "EBCDIC-CP-DK");
        s_revhash.put("CP277",     "EBCDIC-CP-NO");
        s_revhash.put("CP278",     "EBCDIC-CP-FI");
        s_revhash.put("CP278",     "EBCDIC-CP-SE");
        s_revhash.put("CP280",     "EBCDIC-CP-IT");
        s_revhash.put("CP284",     "EBCDIC-CP-ES");
        s_revhash.put("CP285",     "EBCDIC-CP-GB");
        s_revhash.put("CP297",     "EBCDIC-CP-FR");
        s_revhash.put("CP420",     "EBCDIC-CP-AR1");
        s_revhash.put("CP424",     "EBCDIC-CP-HE");
        s_revhash.put("CP500",     "EBCDIC-CP-CH");
        s_revhash.put("CP870",     "EBCDIC-CP-ROECE");
        s_revhash.put("CP870",     "EBCDIC-CP-YU");
        s_revhash.put("CP871",     "EBCDIC-CP-IS");
        s_revhash.put("CP918",     "EBCDIC-CP-AR2");
    }
    private MIME2Java() {
    }

    
Convert a MIME charset name, also known as an XML encoding name, to a Java encoding name.

Parameters:
mimeCharsetName Case insensitive MIME charset name: UTF-8, US-ASCII, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-2022-JP, Shift_JIS, EUC-JP, GB2312, Big5, EUC-KR, ISO-2022-KR, KOI8-R, EBCDIC-CP-US, EBCDIC-CP-CA, EBCDIC-CP-NL, EBCDIC-CP-DK, EBCDIC-CP-NO, EBCDIC-CP-FI, EBCDIC-CP-SE, EBCDIC-CP-IT, EBCDIC-CP-ES, EBCDIC-CP-GB, EBCDIC-CP-FR, EBCDIC-CP-AR1, EBCDIC-CP-HE, EBCDIC-CP-CH, EBCDIC-CP-ROECE, EBCDIC-CP-YU, EBCDIC-CP-IS and EBCDIC-CP-AR2.
Returns:
Java encoding name, or null if mimeCharsetName is unknown.
See also:
reverse(java.lang.String)
    public static String convert(String mimeCharsetName) {
        return (String) s_enchash.get(mimeCharsetName.toUpperCase(Locale.ENGLISH));
    }

    
Convert a Java encoding name to MIME charset name. Available values of encoding are "UTF8", "8859_1", "8859_2", "8859_3", "8859_4", "8859_5", "8859_6", "8859_7", "8859_8", "8859_9", "JIS", "SJIS", "EUCJIS", "GB2312", "BIG5", "KSC5601", "ISO2022KR", "KOI8_R", "CP037", "CP277", "CP278", "CP280", "CP284", "CP285", "CP297", "CP420", "CP424", "CP500", "CP870", "CP871" and "CP918".

Parameters:
encoding Case insensitive Java encoding name: UTF8, 8859_1, 8859_2, 8859_3, 8859_4, 8859_5, 8859_6, 8859_7, 8859_8, 8859_9, JIS, SJIS, EUCJIS, GB2312, BIG5, KSC5601, ISO2022KR, KOI8_R, CP037, CP277, CP278, CP280, CP284, CP285, CP297, CP420, CP424, CP500, CP870, CP871 and CP918.
Returns:
MIME charset name, or null if encoding is unknown.
See also:
convert(java.lang.String)
    public static String reverse(String encoding) {
        return (String) s_revhash.get(encoding.toUpperCase(Locale.ENGLISH));
    }
}
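
One consequence of the reverse table is worth illustrating: the static initializer inserts the same Java encoding name several times (for example "CP037" three times), and `Hashtable.put` replaces any earlier value for a key. The sketch below reproduces only those three entries. `ReverseCollisionSketch` and `buildReverse` are hypothetical names used for illustration, not part of the class.

```java
import java.util.Hashtable;

public class ReverseCollisionSketch {

    // Rebuilds just the three CP037 entries, in the same order as the initializer.
    static Hashtable<String, String> buildReverse() {
        Hashtable<String, String> rev = new Hashtable<>();
        rev.put("CP037", "EBCDIC-CP-US");
        rev.put("CP037", "EBCDIC-CP-CA"); // replaces EBCDIC-CP-US
        rev.put("CP037", "EBCDIC-CP-NL"); // replaces EBCDIC-CP-CA
        return rev;
    }

    public static void main(String[] args) {
        Hashtable<String, String> rev = buildReverse();
        System.out.println(rev.size());       // a single key survives
        System.out.println(rev.get("CP037")); // the value from the last put
    }
}
```

So a round trip such as converting "EBCDIC-CP-US" to "CP037" and back need not return the original MIME name; callers that require a specific name for a shared code page should not rely on the reverse lookup alone.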