io - How does Java's DataInputStream readUTF method read a string?

Question

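Hi everyone, I'm a complete beginner. While studying Java I/O today I ran into a problem I can't figure out. The code below is an exercise I did on the DataInputStream class. I don't understand why DataInputStream's readUTF method takes no arguments, yet when reading...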

ringa_lee · Answer

DataOutputStream

static int writeUTF(String str, DataOutput out) throws IOException {
    int strlen = str.length();
    int utflen = 0;
    int c, count = 0;

    /* use charAt instead of copying String to char array */
    for (int i = 0; i < strlen; i++) {
        c = str.charAt(i);
        if ((c >= 0x0001) && (c <= 0x007F)) {
            utflen++;
        } else if (c > 0x07FF) {
            utflen += 3;
        } else {
            utflen += 2;
        }
    }

    if (utflen > 65535)
        throw new UTFDataFormatException(
            "encoded string too long: " + utflen + " bytes");

    byte[] bytearr = null;
    if (out instanceof DataOutputStream) {
        DataOutputStream dos = (DataOutputStream)out;
        if(dos.bytearr == null || (dos.bytearr.length < (utflen+2)))
            dos.bytearr = new byte[(utflen*2) + 2];
        bytearr = dos.bytearr;
    } else {
        bytearr = new byte[utflen+2];
    }

    // Write the string's encoded byte length into the stream (two bytes, high byte first)
    bytearr[count++] = (byte) ((utflen >>> 8) & 0xFF);
    bytearr[count++] = (byte) ((utflen >>> 0) & 0xFF);

    int i=0;
    for (i=0; i<strlen; i++) {
       c = str.charAt(i);
       if (!((c >= 0x0001) && (c <= 0x007F))) break;
       bytearr[count++] = (byte) c;
    }

    for (;i < strlen; i++){
        c = str.charAt(i);
        if ((c >= 0x0001) && (c <= 0x007F)) {
            bytearr[count++] = (byte) c;

        } else if (c > 0x07FF) {
            bytearr[count++] = (byte) (0xE0 | ((c >> 12) & 0x0F));
            bytearr[count++] = (byte) (0x80 | ((c >>  6) & 0x3F));
            bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
        } else {
            bytearr[count++] = (byte) (0xC0 | ((c >>  6) & 0x1F));
            bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
        }
    }
    out.write(bytearr, 0, utflen+2);
    // The number of bytes written is the encoded length plus 2, for the two-byte length prefix
    return utflen + 2;
}

DataInputStream

public final static String readUTF(DataInput in) throws IOException {
    // Read the string's byte length
    int utflen = in.readUnsignedShort();
    //...
}

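When you call writeUTF, the JDK first writes the string's encoded byte count into the stream as a two-byte prefix. On reading, readUTF reads that length first and then reads exactly that many bytes and decodes them into the string. That is why readUTF needs no length argument.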
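The length prefix can be observed directly by writing into an in-memory buffer. The following is a minimal sketch, not part of the JDK source above; the class name UtfRoundTrip and its helper methods are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: shows the two-byte length prefix that writeUTF emits.
class UtfRoundTrip {
    // Encode a string with writeUTF and return the raw bytes.
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Rebuild the length prefix by hand from the first two bytes.
    static int lengthPrefix(byte[] bytes) {
        return ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
    }

    // Decode with readUTF, which takes no length argument.
    static String decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = encode("hello");
        System.out.println(bytes.length);        // 7: 2 prefix bytes + 5 payload bytes
        System.out.println(lengthPrefix(bytes)); // 5
        System.out.println(decode(bytes));       // hello
    }
}
```

Because the prefix is only two bytes, the encoded string can be at most 65535 bytes long, which matches the UTFDataFormatException check in writeUTF.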

ringa_lee · Answer

Look at the source: the very first statement of the call obtains the length.
int utflen = in.readUnsignedShort();
The documentation for this method:

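Reads two input bytes and returns an int value in the range 0 through 65535. Let a be the first byte read and b be the second byte. The value returned is:

(((a & 0xff) << 8) | (b & 0xff))

This method is suitable for reading the bytes written by the writeShort method of interface DataOutput if the argument to writeShort was intended to be a value in the range 0 through 65535.
Returns: the unsigned 16-bit value read.
Throws:
EOFException - if this stream reaches the end before reading all the bytes.
IOException - if an I/O error occurs.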
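To make the unsigned interpretation concrete, here is a small sketch that combines two bytes by hand and compares the result with readUnsignedShort and readShort on the same input. The class name UnsignedShortDemo and its methods are hypothetical, chosen only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: unsigned vs. signed reading of the same two bytes.
class UnsignedShortDemo {
    // Manual big-endian combination: first byte is the high byte.
    static int combine(byte a, byte b) {
        return ((a & 0xff) << 8) | (b & 0xff);
    }

    // Read the two bytes as an unsigned 16-bit value (0..65535).
    static int readUnsigned(byte[] two) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(two))) {
            return in.readUnsignedShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the same two bytes as a signed short (-32768..32767).
    static int readSigned(byte[] two) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(two))) {
            return in.readShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] two = {(byte) 0xFF, (byte) 0xFE};
        System.out.println(combine(two[0], two[1])); // 65534
        System.out.println(readUnsigned(two));       // 65534
        System.out.println(readSigned(two));         // -2
    }
}
```

Reading the prefix as unsigned is what allows lengths above 32767 to round-trip; a signed read of the same bytes would yield a negative length.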

The documentation for readUTF:

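Reads from the stream a representation of a Unicode character string encoded in modified UTF-8 format; this string of characters is then returned as a String. The details of the modified UTF-8 representation are exactly the same as for the readUTF method of DataInput.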
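The "modified" part shows up in the writeUTF source quoted earlier: the one-byte branch starts at 0x0001, so the null character takes two bytes, and each char of a surrogate pair is encoded separately as three bytes. The sketch below compares payload sizes against standard UTF-8; the class name ModifiedUtf8Demo and its methods are assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: payload size under writeUTF vs. standard UTF-8.
class ModifiedUtf8Demo {
    // Bytes written by writeUTF, excluding the two-byte length prefix.
    static int writeUtfPayloadLength(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size() - 2;
    }

    // Bytes produced by the standard UTF-8 charset.
    static int standardUtf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String nul = "\0";               // U+0000
        String emoji = "\uD83D\uDE00";   // one code point stored as a surrogate pair
        System.out.println(writeUtfPayloadLength("A") + " vs " + standardUtf8Length("A"));     // 1 vs 1
        System.out.println(writeUtfPayloadLength(nul) + " vs " + standardUtf8Length(nul));     // 2 vs 1
        System.out.println(writeUtfPayloadLength(emoji) + " vs " + standardUtf8Length(emoji)); // 6 vs 4
    }
}
```

In practice this means bytes produced by writeUTF should be read back with readUTF rather than decoded as ordinary UTF-8.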