Common ways to convert an InputStream to a String in Java

1. Ways to convert an InputStream to a String:

1. Using IOUtils.toString (Apache Commons IO)
    String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);

2. Using CharStreams (Guava)
    String result = CharStreams.toString(new InputStreamReader(inputStream, Charsets.UTF_8));

3. Using Scanner (JDK)
    Scanner s = new Scanner(inputStream).useDelimiter("\\A");
    String result = s.hasNext() ? s.next() : "";

4. Using Stream API (Java 8). Warning: This solution converts different line breaks (like \r\n) to \n.
    String result = new BufferedReader(new InputStreamReader(inputStream))
            .lines().collect(Collectors.joining("\n"));

5. Using parallel Stream API (Java 8). Warning: This solution converts different line breaks (like \r\n) to \n.
    String result = new BufferedReader(new InputStreamReader(inputStream))
            .lines().parallel().collect(Collectors.joining("\n"));

6. Using InputStreamReader and StringBuilder (JDK)
    int bufferSize = 1024;
    char[] buffer = new char[bufferSize];
    StringBuilder out = new StringBuilder();
    Reader in = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
    for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) > 0; ) {
        out.append(buffer, 0, numRead);
    }
    return out.toString();

7. Using StringWriter and IOUtils.copy (Apache Commons IO)
    StringWriter writer = new StringWriter();
    IOUtils.copy(inputStream, writer, "UTF-8");
    return writer.toString();

8. Using ByteArrayOutputStream and inputStream.read (JDK)
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    for (int length; (length = inputStream.read(buffer)) != -1; ) {
        result.write(buffer, 0, length);
    }
    // On JDK 7 and later, StandardCharsets.UTF_8.name() can replace the "UTF-8" literal
    return result.toString("UTF-8");

9. Using BufferedReader (JDK). Warning: This solution converts different line breaks (like \r\n) to the line.separator system property (for example, to "\r\n" on Windows).
    String newLine = System.getProperty("line.separator");
    BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
    StringBuilder result = new StringBuilder();
    for (String line; (line = reader.readLine()) != null; ) {
        if (result.length() > 0) {
            result.append(newLine);
        }
        result.append(line);
    }
    return result.toString();

10. Using BufferedInputStream and ByteArrayOutputStream (JDK)
    BufferedInputStream bis = new BufferedInputStream(inputStream);
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    for (int result = bis.read(); result != -1; result = bis.read()) {
        buf.write((byte) result);
    }
    // On JDK 7 and later, StandardCharsets.UTF_8.name() can replace the "UTF-8" literal
    return buf.toString("UTF-8");

11. Using inputStream.read() and StringBuilder (JDK). Warning: This solution casts each byte straight to a char, so multi-byte encodings such as UTF-8 are corrupted, for example with Russian text (it works correctly only when every character is a single byte)
    StringBuilder sb = new StringBuilder();
    for (int ch; (ch = inputStream.read()) != -1; ) {
        sb.append((char) ch);
    }
    return sb.toString();

Warning:

  1. Solutions 4, 5 and 9 convert different line breaks to a single one.

  2. Solution 11 does not work correctly with text that contains multi-byte characters.
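The two warnings can be reproduced with a few lines of plain JDK code. The sketch below is illustrative only: the class name `StreamToStringCaveats` and its helper names are hypothetical, and the three methods are condensed copies of solutions 4, 8 and 11 with UTF-8 set explicitly.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of any library.
class StreamToStringCaveats {

    // Solution 4: line-based reading; every line terminator is replaced by "\n".
    static String viaLines(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
                .lines()
                .collect(Collectors.joining("\n"));
    }

    // Solution 8: copies the raw bytes and decodes once, so line terminators survive.
    static String viaByteArray(InputStream in) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        for (int length; (length = in.read(buffer)) != -1; ) {
            result.write(buffer, 0, length);
        }
        return result.toString("UTF-8");
    }

    // Solution 11: casts each byte to a char, which splits multi-byte UTF-8 sequences.
    static String viaByteCast(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int ch; (ch = in.read()) != -1; ) {
            sb.append((char) ch);
        }
        return sb.toString();
    }

    // Helper: wraps a string as a UTF-8 encoded in-memory stream.
    static InputStream utf8(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }
}
```

With the input `"a\r\nb"`, `viaLines` yields `"a\nb"` while `viaByteArray` keeps the original `\r\n`. A single Cyrillic letter occupies two bytes in UTF-8, so `viaByteCast` turns it into two unrelated chars.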

Performance tests

Performance tests for a small String (length = 175), benchmark code on GitHub (mode = Average Time, system = Linux; lower is better, and the score 1,343 is the best):

Benchmark                                       Mode  Cnt   Score   Error  Units
 8. ByteArrayOutputStream and read (JDK)        avgt   10   1,343 ± 0,028  us/op
 6. InputStreamReader and StringBuilder (JDK)   avgt   10   6,980 ± 0,404  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10   7,437 ± 0,735  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10   8,977 ± 0,328  us/op
 7. StringWriter and IOUtils.copy (Apache)      avgt   10  10,613 ± 0,599  us/op
 1. IOUtils.toString (Apache Utils)             avgt   10  10,605 ± 0,527  us/op
 3. Scanner (JDK)                               avgt   10  12,083 ± 0,293  us/op
 2. CharStreams (guava)                         avgt   10  12,999 ± 0,514  us/op
 4. Stream Api (Java 8)                         avgt   10  15,811 ± 0,605  us/op
 9. BufferedReader (JDK)                        avgt   10  16,038 ± 0,711  us/op
 5. parallel Stream Api (Java 8)                avgt   10  21,544 ± 0,583  us/op

Performance tests for a big String (length = 50100), benchmark code on GitHub (mode = Average Time, system = Linux; lower is better, and the score 200,715 is the best):

Benchmark                                       Mode  Cnt     Score        Error  Units
 8. ByteArrayOutputStream and read (JDK)        avgt   10   200,715 ±   18,103  us/op
 1. IOUtils.toString (Apache Utils)             avgt   10   300,019 ±    8,751  us/op
 6. InputStreamReader and StringBuilder (JDK)   avgt   10   347,616 ±  130,348  us/op
 7. StringWriter and IOUtils.copy (Apache)      avgt   10   352,791 ±  105,337  us/op
 2. CharStreams (guava)                         avgt   10   420,137 ±   59,877  us/op
 9. BufferedReader (JDK)                        avgt   10   632,028 ±   17,002  us/op
 5. parallel Stream Api (Java 8)                avgt   10   662,999 ±   46,199  us/op
 4. Stream Api (Java 8)                         avgt   10   701,269 ±   82,296  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10   740,837 ±    5,613  us/op
 3. Scanner (JDK)                               avgt   10   751,417 ±   62,026  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10  2919,350 ± 1101,942  us/op

Graphs (performance tests depending on Input Stream length in Windows 7 system)

Performance test (Average Time) depending on Input Stream length in Windows 7 system:

length   182     546      1092     3276     9828      29484     58968
test8    0.38    0.938    1.868    4.448    13.412    36.459    72.708
test4    2.362   3.609    5.573    12.769   40.74     81.415    159.864
test5    3.881   5.075    6.904    14.123   50.258    129.937   166.162
test9    2.237   3.493    5.422    11.977   45.98     89.336    177.39
test6    1.261   2.12     4.38     10.698   31.821    86.106    186.636
test7    1.601   2.391    3.646    8.367    38.196    110.221   211.016
test1    1.529   2.381    3.527    8.411    40.551    105.16    212.573
test3    3.035   3.934    8.606    20.858   61.571    118.744   235.428
test2    3.136   6.238    10.508   33.48    43.532    118.044   239.481
test10   1.593   4.736    7.527    20.557   59.856    162.907   323.147
test11   3.913   11.506   23.26    68.644   207.591   600.444   1211.545

2. Analysis of java.nio.file.FileSystemNotFoundException when using Paths.get with NIO

Question:

I have a Maven project and inside a method I want to create a path for a directory in my resources folder. This is done like this:

    try {
        final URI uri = getClass().getResource("/my-folder").toURI();
        Path myFolderPath = Paths.get(uri);
    } catch (final URISyntaxException e) {
        ...
    }

The generated URI looks like jar:file:/C:/path/to/my/project.jar!/my-folder.

The stack trace is as follows:

    Exception in thread "pool-4-thread-1" java.nio.file.FileSystemNotFoundException
        at com.sun.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:171)
        at com.sun.nio.zipfs.ZipFileSystemProvider.getPath(ZipFileSystemProvider.java:157)
        at java.nio.file.Paths.get(Paths.java:143)

The URI seems to be valid. The part before ! points to the generated jar file and the part after it to my-folder in the root of the archive. I have used these instructions before to create paths to my resources. Why am I getting an exception now?

Answer:

You need to create the file system before you can access a path within the zip, like this:

    final URI uri = getClass().getResource("/my-folder").toURI();
    Map<String, String> env = new HashMap<>();
    env.put("create", "true");
    FileSystem zipfs = FileSystems.newFileSystem(uri, env);
    Path myFolderPath = Paths.get(uri);

This is not done automatically.

See http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html

or

    private FileSystem initFileSystem(URI uri) throws IOException {
        try {
            return FileSystems.newFileSystem(uri, Collections.emptyMap());
        } catch (IllegalArgumentException e) {
            return FileSystems.getDefault();
        }
    }

or

    private FileSystem initFileSystem(URI uri) throws IOException {
        try {
            return FileSystems.getFileSystem(uri);
        } catch (FileSystemNotFoundException e) {
            Map<String, String> env = new HashMap<>();
            env.put("create", "true");
            return FileSystems.newFileSystem(uri, env);
        }
    }

Calling this with the URI you are about to load ensures the file system is in working condition. I always call FileSystem.close() after using it:

    FileSystem zipfs = initFileSystem(fileURI);
    filePath = Paths.get(fileURI);
    // Do whatever you need and then close the filesystem
    zipfs.close();

Careful: a ZipFileSystem can be closed, but closing the default file system (for example WindowsFileSystem) throws UnsupportedOperationException.
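To see the whole lifecycle without a real project jar, the following sketch builds a small archive in the temp directory and reads one entry from it. It is a minimal illustration under stated assumptions: the class `ZipFsLifecycleDemo`, its method names and the entry `my-folder/hello.txt` are hypothetical, and an empty environment map is used because the archive already exists.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystemNotFoundException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of any library.
class ZipFsLifecycleDemo {

    // Builds a tiny archive so the demo does not depend on a real project jar.
    static Path createSampleZip() throws IOException {
        Path zip = Files.createTempFile("zipfs-demo", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("my-folder/hello.txt"));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return zip;
    }

    // Builds a jar: URI of the same shape as the one in the question.
    static URI entryUri(Path zip) {
        return URI.create("jar:" + zip.toUri() + "!/my-folder/hello.txt");
    }

    // Returns true when Paths.get(uri) fails because no zip file system is open yet.
    static boolean failsWithoutFileSystem(URI uri) {
        try {
            Paths.get(uri);
            return false;
        } catch (FileSystemNotFoundException e) {
            return true;
        }
    }

    // Opens the zip file system, reads the entry, and closes the file system again.
    static String readEntry(URI uri) throws IOException {
        try (FileSystem zipfs =
                     FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap())) {
            Path entry = Paths.get(uri);
            return new String(Files.readAllBytes(entry), StandardCharsets.UTF_8);
        }
    }
}
```

The try-with-resources block closes the zip file system as soon as the entry has been read, so the returned value must not be a Path that is used later. That is why `readEntry` returns the decoded content instead.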

3. Loading a file with NIO works when run in IDEA, but fails on both Windows and Linux once packaged as a jar

    public void test() throws Exception {
        URI uri = getClass().getClassLoader().getResource("conf/sh.txt").toURI();
        FileSystem aDefault = FileSystems.getDefault();
        System.out.println(aDefault.getClass());
        FileSystemProvider provider = FileSystems.getDefault().provider();
        System.out.println(provider.getClass());
        System.out.println("====================" + uri.getScheme());
        List<FileSystemProvider> fileSystemProviders = FileSystemProvider.installedProviders();
        fileSystemProviders.forEach(p -> System.out.println(p.getClass()));
        Path path = Paths.get(uri);
    }

In this case there is no problem when running in IDEA:

    class sun.nio.fs.WindowsFileSystem
    class sun.nio.fs.WindowsFileSystemProvider
    ====================file
    class sun.nio.fs.WindowsFileSystemProvider
    class com.sun.nio.zipfs.ZipFileSystemProvider

But when running from the packaged jar, the line Path path = Paths.get(uri) throws an exception:

    Exception in thread "pool-4-thread-1" java.nio.file.FileSystemNotFoundException
        at com.sun.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:171)
        at com.sun.nio.zipfs.ZipFileSystemProvider.getPath(ZipFileSystemProvider.java:157)
        at java.nio.file.Paths.get(Paths.java:143)

The root cause lies in how the FileSystemProvider is selected. First look at java.nio.file.Paths#get(java.net.URI):

    public static Path get(URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null)
            throw new IllegalArgumentException("Missing scheme");

        // check for default provider to avoid loading of installed providers
        if (scheme.equalsIgnoreCase("file"))
            return FileSystems.getDefault().provider().getPath(uri);

        // try to find provider
        for (FileSystemProvider provider : FileSystemProvider.installedProviders()) {
            if (provider.getScheme().equalsIgnoreCase(scheme)) {
                return provider.getPath(uri);
            }
        }

        throw new FileSystemNotFoundException("Provider " + scheme + " not installed");
    }
  • uri.getScheme() is file when run in IDEA, and becomes jar once the project is packaged as a jar.
  • When the scheme is file, the URI is handled by FileSystems.getDefault().provider(), which is WindowsFileSystemProvider on Windows and LinuxFileSystemProvider on Linux.
  • FileSystemProvider.installedProviders() contains WindowsFileSystemProvider and ZipFileSystemProvider on Windows, and LinuxFileSystemProvider and ZipFileSystemProvider on Linux.
  • When the scheme is not file, the URI is handled by the provider in FileSystemProvider.installedProviders() whose scheme matches uri.getScheme(), which here is ZipFileSystemProvider.
  • The FileSystem that belongs to ZipFileSystemProvider has to be created explicitly; for usage and creation see: https://docs.oracle.com/javase/8/docs/technotes/guides/io/fsp/zipfilesystemprovider.html
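The scheme-based lookup described in these points can be observed directly. The sketch below is only an approximation of what Paths.get(URI) does internally; the class `UriSchemeDispatch` and the method `providerFor` are hypothetical names, and the URIs in the usage note are made-up examples.

```java
import java.net.URI;
import java.nio.file.spi.FileSystemProvider;
import java.util.Optional;

// Hypothetical demo class, not part of any library.
class UriSchemeDispatch {

    // Returns the class name of the installed provider whose scheme matches the URI,
    // or an empty Optional when no installed provider handles that scheme.
    static Optional<String> providerFor(URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("Missing scheme");
        }
        for (FileSystemProvider provider : FileSystemProvider.installedProviders()) {
            if (provider.getScheme().equalsIgnoreCase(scheme)) {
                return Optional.of(provider.getClass().getName());
            }
        }
        return Optional.empty();
    }
}
```

Calling `providerFor(URI.create("file:///tmp/demo.txt"))` resolves to the platform's default provider, while a URI such as `jar:file:/C:/path/to/my/project.jar!/my-folder` has the scheme `jar` and is matched against the zip provider when one is installed.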

Solution:

Handle the failure at Path path = Paths.get(uri):

    Path path = null;
    try {
        path = Paths.get(uri);
    } catch (FileSystemNotFoundException e) {
        // @see https://stackoverflow.com/questions/25032716/getting-filesystemnotfoundexception-from-zipfilesystemprovider-when-creating-a-p
        // @see http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html
        Map<String, String> env = new HashMap<>();
        env.put("create", "true");
        FileSystem zipfs = FileSystems.newFileSystem(uri, env);
        path = Paths.get(uri);
    }

Or load the resource file in a different way:

    byte[] data;
    try (InputStream in = getClass().getResourceAsStream("/elasticsearch/segmentsIndex.json")) {
        data = IOUtils.toByteArray(in);
    }
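If Commons IO is not on the classpath, the same stream-based loading can be written with the JDK alone. This is a minimal sketch: `ResourceBytes` and `read` are hypothetical names, and the null check is an addition that the snippet above does not have.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of any library.
class ResourceBytes {

    // Reads a classpath resource fully into memory. No Path is involved,
    // so it behaves the same whether the resource is a plain file or a jar entry.
    static byte[] read(Class<?> anchor, String name) throws IOException {
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                // getResourceAsStream returns null when the resource does not exist
                throw new IOException("Resource not found: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int n; (n = in.read(buffer)) != -1; ) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

A call such as `ResourceBytes.read(getClass(), "/elasticsearch/segmentsIndex.json")` would replace the IOUtils variant.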

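4. Ways to obtain an InputStream

  • For Path and File, a leading slash means the path is resolved from the root directory; without it, the path is resolved from the current directory
    Path problems = Paths.get("current project path"); // new File("");
    Path problems = Paths.get("/root path (the drive root on Windows)"); // new File("/");
  • Reading resource files from /resource
  1. A path that starts with / is resolved from the resource root; a path without a leading / is resolved relative to the package of the class.

  2. Steps: get the current Class object, then call getResourceAsStream().

  3. getResource() returns a URL. On Windows its path string cannot be used directly; the path is reported as invalid because of the colon after the drive letter:
    Exception in thread "main" java.nio.file.InvalidPathException: Illegal char <:> at index 2: /D:/XXXXX

  4. Path is meant to be used together with Files (Files.walk(path), Files.isDirectory(path), Files.readAllLines(path), Files.newBufferedReader(path), etc.)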
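For items 3 and 4, one way around the InvalidPathException is to convert the URL to a URI and let Paths.get(URI) do the parsing, instead of passing the URL's path string to Paths.get(String). The sketch below assumes a file: URL (a resource that is not inside a jar); `UrlToPath` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical demo class, not part of any library.
class UrlToPath {

    // Converts a file: URL to a Path through its URI form. This also undoes
    // percent-encoding, for example %20 for a space in a directory name.
    static Path toPath(URL url) throws URISyntaxException {
        return Paths.get(url.toURI());
    }

    // Combines the conversion with the Files API, as item 4 suggests.
    static List<String> readLines(URL url) throws IOException, URISyntaxException {
        return Files.readAllLines(toPath(url), StandardCharsets.UTF_8);
    }
}
```

In application code the URL would typically come from `getClass().getResource(...)`. For a resource inside a jar the URI scheme is jar, and the zip file system has to be opened first.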

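Notes:

  • Class, no leading /: getResource(""); and getResource("filename"); behave differently
    • Empty string "": returns /build/classes/java/main/com/xxx/xx/x (no leading /, so this is the package path)
    • Non-empty name, two cases:
      • The file exists: returns /build/resources/main/com/xxx/xx/x (no leading /, so this is the package path under /resource)
      • The file does not exist: returns null
  • Class, leading /:
    • getResource("/"); returns /build/classes/java/main/ (under Gradle, this path is the same on every operating system)
    • With a path: returns the file under /build/resources/main/ (commonly used)
  • ClassLoader, no leading /: getClassLoader().getResource(""); returns the same result as getResource("/"); (commonly used)
  • ClassLoader, leading /: getClassLoader().getResource("/"); returns null (explained later)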
    /**
     * Get the path directly from the file name via getPath()
     *
     * @param fileName
     * @throws IOException
     */
    public void func00(String fileName) throws IOException {
        // note: with an empty string, getResource("") returns the classpath root
        String path = this.getClass().getClassLoader().getResource(fileName).getPath();
        System.out.println(path);
        // a path containing Chinese characters comes back URL-encoded, so decode it here
        String filePath = URLDecoder.decode(path, "UTF-8");
        System.out.println(filePath);
        getFileContent(filePath);
    }

    /**
     * Get the path directly from the file name via getFile()
     *
     * url.getFile()=/pub/files/foobar.txt?id=123456
     * url.getPath()=/pub/files/foobar.txt
     *
     * @param fileName
     * @throws IOException
     */
    public void func01(String fileName) throws IOException {
        // note: with an empty string, getResource("") returns the classpath root
        String path = this.getClass().getClassLoader().getResource(fileName).getFile();
        System.out.println(path);
        // a path containing Chinese characters comes back URL-encoded, so decode it here
        String filePath = URLDecoder.decode(path, "UTF-8");
        System.out.println(filePath);
        getFileContent(filePath);
    }

    /**
     * Get the stream directly with getResourceAsStream
     * Spring Boot projects need this approach, because a jar has no real file path that holds the file
     *
     * @param fileName
     * @throws IOException
     */
    public void func02(String fileName) throws IOException {
        // a path containing Chinese characters comes back URL-encoded, so decode it here
        String filePath = URLDecoder.decode(fileName, "UTF-8");
        InputStream in = this.getClass().getClassLoader().getResourceAsStream(filePath);
        getFileContent(in);
    }

    /**
     * Get the stream through the ClassPathResource class; recommended for Spring Boot
     * Spring Boot projects need this approach, because a jar has no real file path that holds the file
     *
     * @param fileName
     * @throws IOException
     */
    public void func03(String fileName) throws IOException {
        ClassPathResource classPathResource = new ClassPathResource(fileName);
        InputStream inputStream = classPathResource.getInputStream();
        getFileContent(inputStream);
    }

    /**
     * Locate the file in the project through an absolute path (new File("") yields the current
     * absolute path; it is only a local absolute path and cannot be used on a server)
     *
     * @param fileName
     * @throws IOException
     */
    public void func04(String fileName) throws IOException {
        // empty argument
        File directory = new File("");
        // Canonical path: getCanonicalPath() returns the absolute path with symbols such as ..\ and .\ resolved
        String rootCanonicalPath = directory.getCanonicalPath();
        // Absolute path: getAbsolutePath() returns the file's absolute path. If the File was constructed
        // with a full path, that path is returned as is; if it was constructed with a relative path,
        // the result is the current directory path plus the path given to the File constructor
        String rootAbsolutePath = directory.getAbsolutePath();
        System.out.println(rootCanonicalPath);
        System.out.println(rootAbsolutePath);
        String filePath = rootCanonicalPath + "\\chapter-2-springmvc-quickstart\\src\\main\\resources\\" + fileName;
        getFileContent(filePath);
    }
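The methods above call a helper named getFileContent that is not shown. What it does in the original project is unknown; the sketch below is only an assumption that it prints the content line by line, with one overload per argument type. The class name `FileContentPrinter` and the extra `readLines` method are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the helper used by func00 to func04.
class FileContentPrinter {

    // Overload for the path-based variants (func00, func01, func04).
    static void getFileContent(String filePath) throws IOException {
        try (InputStream in = new FileInputStream(filePath)) {
            getFileContent(in);
        }
    }

    // Overload for the stream-based variants (func02, func03).
    static void getFileContent(InputStream in) throws IOException {
        for (String line : readLines(in)) {
            System.out.println(line);
        }
    }

    // Decodes the stream as UTF-8 and collects its lines.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        for (String line; (line = reader.readLine()) != null; ) {
            lines.add(line);
        }
        return lines;
    }
}
```

The String overload closes the stream it opens. The InputStream overload leaves closing to the caller, since the caller created the stream.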

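5. Ways to read an InputStream

An ordinary Reader is actually built on top of an InputStream.
Set the encoding on a Reader whenever possible: the Reader reads bytes from the InputStream and converts them to chars according to the configured encoding, which is what makes it a character stream.
In essence, a Reader is a byte-to-char converter based on an InputStream.

  • FileReader

Opens a file and returns a Reader. The default encoding depends on the system. Looking at the FileReader source, it holds a FileInputStream internally, which has to be closed properly.

  • CharArrayReader

CharArrayReader simulates a Reader in memory, similar to ByteArrayInputStream.

  • StringReader

StringReader uses a String directly as its data source; it is almost identical to CharArrayReader.

  • InputStreamReader

If we already have an InputStream and want to turn it into a Reader, InputStreamReader is exactly that converter.

  • BufferedReader

Once we have an InputStreamReader and want to read line by line, we can use BufferedReader. It accepts only a Reader as its constructor argument and acts as a wrapper.

Readers fall into two groups:
one group supplies data directly;
the other adds extra functionality on top of another Reader, similar to FilterInputStream (the Filter pattern).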

    // BufferedReader
    String inContent = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))
            .lines()
            // .skip(13)     // optionally skip a number of lines
            // .parallel()   // optionally process in parallel
            .collect(Collectors.joining("\n"));

    // StringBuilder (an alternative to the variant above; a stream can only be consumed once)
    final int bufferSize = 4 * 0x400; // 4K chars
    char[] buffer = new char[bufferSize];
    StringBuilder sb = new StringBuilder();
    Reader in = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
    for (int numRead; (numRead = in.read(buffer, 0, bufferSize)) > 0; ) {
        sb.append(buffer, 0, numRead);
    }
    String inContent = sb.toString();
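Why the encoding on the Reader matters can be shown by decoding the same bytes with two different charsets. This is a small self-contained sketch; `ReaderCharsetDemo` and `decode` are hypothetical names, and the loop is the StringBuilder variant from above wrapped in try-with-resources.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of any library.
class ReaderCharsetDemo {

    // Decodes the given bytes through an InputStreamReader that uses the given charset.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4 * 0x400];
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) > 0; ) {
                sb.append(buffer, 0, numRead);
            }
        }
        return sb.toString();
    }
}
```

Six Cyrillic letters encoded as UTF-8 occupy twelve bytes. Decoding them with StandardCharsets.UTF_8 restores the six letters, while StandardCharsets.ISO_8859_1 produces twelve unrelated chars, one per byte.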

6. ClassLoader#getResourceAsStream() does not accept a leading slash /

No leading “/” (all names are absolute)

https://stackoverflow.com/questions/47900677/where-does-leading-slash-in-java-class-loader-getresource-leads-to
A leading slash works only for class.getResource(), where it overrides the default package-relative behavior. There is no leading slash concept for class.getClassLoader().getResource(), so a name with a leading slash always returns null.

https://stackoverflow.com/questions/3803326/this-getclass-getclassloader-getresource-and-nullpointerexception
The reason you can’t use a leading / in the ClassLoader path is because all ClassLoader paths are absolute and so / is not a valid first character in the path.
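The four lookup styles can be compared side by side. The sketch below uses a JDK class file as the resource only because it exists on every installation; in a real project the names would be application resources. `LeadingSlashLookup` and its method names are hypothetical.

```java
import java.net.URL;

// Hypothetical demo class, not part of any library.
class LeadingSlashLookup {

    // Class lookup, no leading slash: resolved relative to the package java/lang.
    static URL viaClassRelative() {
        return String.class.getResource("String.class");
    }

    // Class lookup, leading slash: resolved from the root of the classpath.
    static URL viaClassAbsolute() {
        return String.class.getResource("/java/lang/String.class");
    }

    // ClassLoader lookup: the name is always absolute and has no leading slash.
    static URL viaLoader() {
        return ClassLoader.getSystemClassLoader().getResource("java/lang/String.class");
    }

    // ClassLoader lookup with a leading slash: per the answers quoted above,
    // this is expected to find nothing.
    static URL viaLoaderWithSlash() {
        return ClassLoader.getSystemClassLoader().getResource("/java/lang/String.class");
    }
}
```

The first three calls locate the same class file. The last one is the form to avoid; the quoted answers say it returns null.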
