Common ways to convert an InputStream to a String in Java
1. Ways to convert an InputStream to a String:
1. Using IOUtils.toString (Apache Utils)
    String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
2. Using CharStreams (Guava)
    String result = CharStreams.toString(new InputStreamReader(inputStream, Charsets.UTF_8));
3. Using Scanner (JDK)
    Scanner s = new Scanner(inputStream).useDelimiter("\\A");
    String result = s.hasNext() ? s.next() : "";
4. Using Stream API (Java 8). Warning: This solution converts different line breaks (like \r\n) to \n.
    String result = new BufferedReader(new InputStreamReader(inputStream))
            .lines()
            .collect(Collectors.joining("\n"));
5. Using parallel Stream API (Java 8). Warning: This solution converts different line breaks (like \r\n) to \n.
    String result = new BufferedReader(new InputStreamReader(inputStream))
            .lines()
            .parallel()
            .collect(Collectors.joining("\n"));
6. Using InputStreamReader and StringBuilder (JDK)
    int bufferSize = 1024;
    char[] buffer = new char[bufferSize];
    StringBuilder out = new StringBuilder();
    Reader in = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
    for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) > 0; ) {
        out.append(buffer, 0, numRead);
    }
    return out.toString();
7. Using StringWriter and IOUtils.copy (Apache Commons)
    StringWriter writer = new StringWriter();
    IOUtils.copy(inputStream, writer, "UTF-8");
    return writer.toString();
8. Using ByteArrayOutputStream and inputStream.read (JDK)
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    for (int length; (length = inputStream.read(buffer)) != -1; ) {
        result.write(buffer, 0, length);
    }
    // StandardCharsets.UTF_8.name() > JDK 7
    return result.toString("UTF-8");
9. Using BufferedReader (JDK). Warning: This solution converts different line breaks (like \r\n) to the line.separator system property (for example, "\r\n" on Windows).
    String newLine = System.getProperty("line.separator");
    BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
    StringBuilder result = new StringBuilder();
    for (String line; (line = reader.readLine()) != null; ) {
        if (result.length() > 0) {
            result.append(newLine);
        }
        result.append(line);
    }
    return result.toString();
10. Using BufferedInputStream and ByteArrayOutputStream (JDK)
    BufferedInputStream bis = new BufferedInputStream(inputStream);
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    for (int result = bis.read(); result != -1; result = bis.read()) {
        buf.write((byte) result);
    }
    // StandardCharsets.UTF_8.name() > JDK 7
    return buf.toString("UTF-8");
11. Using inputStream.read() and StringBuilder (JDK). Warning: This solution casts every byte to a char, so it only works for single-byte text; multi-byte text such as Russian in UTF-8 comes out garbled.
    StringBuilder sb = new StringBuilder();
    for (int ch; (ch = inputStream.read()) != -1; ) {
        sb.append((char) ch);
    }
    return sb.toString();
Warning:
Solutions 4, 5 and 9 replace every kind of line break with a single one.
Solution 11 does not work correctly with multi-byte (for example UTF-8 encoded Russian) text.
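Both warnings can be checked with a small experiment. The following is a minimal sketch and not part of the original benchmark; the class `ConversionPitfalls` and its method names are made up for illustration. It feeds the same bytes to solution 4 (Stream API), solution 8 (ByteArrayOutputStream) and solution 11 (byte-by-byte cast) and compares the results with the input text.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
final class ConversionPitfalls {

    // Solution 4: reads line by line, so the original line terminators are lost.
    static String viaLines(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
                .lines()
                .collect(Collectors.joining("\n"));
    }

    // Solution 8: copies the raw bytes and decodes them once at the end.
    static String viaByteArray(InputStream in) {
        try {
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            for (int length; (length = in.read(buffer)) != -1; ) {
                result.write(buffer, 0, length);
            }
            return result.toString("UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Solution 11: casts each byte to a char, which splits multi-byte characters.
    static String viaByteCast(InputStream in) {
        try {
            StringBuilder sb = new StringBuilder();
            for (int ch; (ch = in.read()) != -1; ) {
                sb.append((char) ch);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static InputStream utf8(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String cyrillic = "\u041f\u0440\u0438\u0432\u0435\u0442"; // a Russian word, 2 bytes per letter in UTF-8
        System.out.println(viaLines(utf8("a\r\nb")).equals("a\nb"));       // line break was rewritten
        System.out.println(viaByteArray(utf8("a\r\nb")).equals("a\r\nb")); // line break was preserved
        System.out.println(viaByteCast(utf8(cyrillic)).equals(cyrillic));  // multi-byte text does not survive
    }
}
```

If exact line breaks or non-ASCII text matter, a byte-copying variant such as solution 8 with an explicit charset is the safer choice of the three.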
Performance tests
Performance tests for small String (length = 175), url in github (mode = Average Time, system = Linux, score 1,343 is the best):
    Benchmark                                        Mode  Cnt   Score   Error   Units
     8. ByteArrayOutputStream and read (JDK)         avgt   10   1,343 ± 0,028   us/op
     6. InputStreamReader and StringBuilder (JDK)    avgt   10   6,980 ± 0,404   us/op
    10. BufferedInputStream, ByteArrayOutputStream   avgt   10   7,437 ± 0,735   us/op
    11. InputStream.read() and StringBuilder (JDK)   avgt   10   8,977 ± 0,328   us/op
     7. StringWriter and IOUtils.copy (Apache)       avgt   10  10,613 ± 0,599   us/op
     1. IOUtils.toString (Apache Utils)              avgt   10  10,605 ± 0,527   us/op
     3. Scanner (JDK)                                avgt   10  12,083 ± 0,293   us/op
     2. CharStreams (guava)                          avgt   10  12,999 ± 0,514   us/op
     4. Stream Api (Java 8)                          avgt   10  15,811 ± 0,605   us/op
     9. BufferedReader (JDK)                         avgt   10  16,038 ± 0,711   us/op
     5. parallel Stream Api (Java 8)                 avgt   10  21,544 ± 0,583   us/op
Performance tests for big String (length = 50100), url in github (mode = Average Time, system = Linux, score 200,715 is the best):
    Benchmark                                        Mode  Cnt     Score      Error   Units
     8. ByteArrayOutputStream and read (JDK)         avgt   10   200,715 ±   18,103   us/op
     1. IOUtils.toString (Apache Utils)              avgt   10   300,019 ±    8,751   us/op
     6. InputStreamReader and StringBuilder (JDK)    avgt   10   347,616 ±  130,348   us/op
     7. StringWriter and IOUtils.copy (Apache)       avgt   10   352,791 ±  105,337   us/op
     2. CharStreams (guava)                          avgt   10   420,137 ±   59,877   us/op
     9. BufferedReader (JDK)                         avgt   10   632,028 ±   17,002   us/op
     5. parallel Stream Api (Java 8)                 avgt   10   662,999 ±   46,199   us/op
     4. Stream Api (Java 8)                          avgt   10   701,269 ±   82,296   us/op
    10. BufferedInputStream, ByteArrayOutputStream   avgt   10   740,837 ±    5,613   us/op
     3. Scanner (JDK)                                avgt   10   751,417 ±   62,026   us/op
    11. InputStream.read() and StringBuilder (JDK)   avgt   10  2919,350 ± 1101,942   us/op
Graphs (performance tests depending on Input Stream length in Windows 7 system)
Performance test (Average Time) depending on Input Stream length in Windows 7 system:
    length     182     546    1092    3276     9828    29484     58968
    test8     0.38   0.938   1.868   4.448   13.412   36.459    72.708
    test4    2.362   3.609   5.573  12.769    40.74   81.415   159.864
    test5    3.881   5.075   6.904  14.123   50.258  129.937   166.162
    test9    2.237   3.493   5.422  11.977    45.98   89.336    177.39
    test6    1.261    2.12    4.38  10.698   31.821   86.106   186.636
    test7    1.601   2.391   3.646   8.367   38.196  110.221   211.016
    test1    1.529   2.381   3.527   8.411   40.551   105.16   212.573
    test3    3.035   3.934   8.606  20.858   61.571  118.744   235.428
    test2    3.136   6.238  10.508   33.48   43.532  118.044   239.481
    test10   1.593   4.736   7.527  20.557   59.856  162.907   323.147
    test11   3.913  11.506   23.26  68.644  207.591  600.444  1211.545
2. NIO: analyzing the java.nio.file.FileSystemNotFoundException thrown by Paths.get
Question:
I have a Maven project and inside a method I want to create a path for a directory in my resources folder. This is done like this:
    try {
        final URI uri = getClass().getResource("/my-folder").toURI();
        Path myFolderPath = Paths.get(uri);
    } catch (final URISyntaxException e) {
        ...
    }
The generated URI looks like jar:file:/C:/path/to/my/project.jar!/my-folder.
The stack trace is as follows:
    Exception in thread "pool-4-thread-1" java.nio.file.FileSystemNotFoundException
        at com.sun.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:171)
        at com.sun.nio.zipfs.ZipFileSystemProvider.getPath(ZipFileSystemProvider.java:157)
        at java.nio.file.Paths.get(Paths.java:143)
The URI seems to be valid. The part before ! points to the generated jar file and the part after it to my-folder in the root of the archive. I have used these instructions before to create paths to my resources. Why am I getting an exception now?
Answer:
You need to create the file system before you can access the path within the zip, like this:
    final URI uri = getClass().getResource("/my-folder").toURI();
    Map<String, String> env = new HashMap<>();
    env.put("create", "true");
    FileSystem zipfs = FileSystems.newFileSystem(uri, env);
    Path myFolderPath = Paths.get(uri);
This is not done automatically.
See http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html
or
    private FileSystem initFileSystem(URI uri) throws IOException {
        try {
            return FileSystems.newFileSystem(uri, Collections.emptyMap());
        } catch (IllegalArgumentException e) {
            return FileSystems.getDefault();
        }
    }
or
    private FileSystem initFileSystem(URI uri) throws IOException {
        try {
            return FileSystems.getFileSystem(uri);
        } catch (FileSystemNotFoundException e) {
            Map<String, String> env = new HashMap<>();
            env.put("create", "true");
            return FileSystems.newFileSystem(uri, env);
        }
    }
Calling this with the URI you are about to load will ensure the filesystem is in working condition. I always call FileSystem.close() after using it:
    FileSystem zipfs = initFileSystem(fileURI);
    Path filePath = Paths.get(fileURI);
    // Do whatever you need and then close the filesystem
    zipfs.close();
Careful: a ZipFileSystem can be closed, but the default file system (for example WindowsFileSystem) cannot; calling close() on it throws an UnsupportedOperationException.
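The pieces above can be combined so that a zip file system is opened only when it is missing and closed only by the code that opened it. The following is a minimal sketch under those assumptions; `JarResourceReader`, `readText` and `demo` are hypothetical names, and the temporary zip archive only stands in for a real jar so that the example runs on its own.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystemNotFoundException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; the class and method names are illustrative only.
final class JarResourceReader {

    // Reads a text file addressed by a file: or jar: URI.
    static String readText(URI uri) throws IOException {
        if (!"jar".equalsIgnoreCase(uri.getScheme())) {
            // Default file system: always available and must not be closed.
            return new String(Files.readAllBytes(Paths.get(uri)), StandardCharsets.UTF_8);
        }
        FileSystem openedHere = null;
        try {
            FileSystems.getFileSystem(uri); // already open, owned by someone else
        } catch (FileSystemNotFoundException e) {
            openedHere = FileSystems.newFileSystem(uri, Collections.emptyMap());
        }
        try {
            return new String(Files.readAllBytes(Paths.get(uri)), StandardCharsets.UTF_8);
        } finally {
            if (openedHere != null) {
                openedHere.close(); // close only what this method opened
            }
        }
    }

    // Builds a throwaway zip archive so the sketch can run without a real jar.
    static String demo() throws IOException {
        Path zip = Files.createTempFile("demo", ".zip");
        try {
            try (OutputStream raw = Files.newOutputStream(zip);
                 ZipOutputStream out = new ZipOutputStream(raw)) {
                out.putNextEntry(new ZipEntry("my-folder/hello.txt"));
                out.write("hello from the archive".getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
            URI uri = URI.create("jar:" + zip.toUri() + "!/my-folder/hello.txt");
            return readText(uri);
        } finally {
            Files.deleteIfExists(zip);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

Checking the scheme first keeps the helper usable from an IDE, where the resource URI has the file scheme, as well as from a packaged jar.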
3. Loading a file with NIO works when run from IDEA, but fails on both Windows and Linux once packaged as a jar
    public void test() throws Exception {
        URI uri = getClass().getClassLoader().getResource("conf/sh.txt").toURI();
        FileSystem aDefault = FileSystems.getDefault();
        System.out.println(aDefault.getClass());
        FileSystemProvider provider = FileSystems.getDefault().provider();
        System.out.println(provider.getClass());
        System.out.println("====================" + uri.getScheme());
        List<FileSystemProvider> fileSystemProviders = FileSystemProvider.installedProviders();
        fileSystemProviders.forEach(p -> System.out.println(p.getClass()));
        Path path = Paths.get(uri);
    }
Run from IDEA, this works without problems:
    class sun.nio.fs.WindowsFileSystem
    class sun.nio.fs.WindowsFileSystemProvider
    ====================file
    class sun.nio.fs.WindowsFileSystemProvider
    class com.sun.nio.zipfs.ZipFileSystemProvider
But when run from the packaged jar, the line Path path = Paths.get(uri) throws an exception:
    Exception in thread "pool-4-thread-1" java.nio.file.FileSystemNotFoundException
        at com.sun.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:171)
        at com.sun.nio.zipfs.ZipFileSystemProvider.getPath(ZipFileSystemProvider.java:157)
        at java.nio.file.Paths.get(Paths.java:143)
The root cause lies in how the FileSystemProvider is chosen. First look at java.nio.file.Paths#get(java.net.URI):
    public static Path get(URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null)
            throw new IllegalArgumentException("Missing scheme");

        // check for default provider to avoid loading of installed providers
        if (scheme.equalsIgnoreCase("file"))
            return FileSystems.getDefault().provider().getPath(uri);

        // try to find provider
        for (FileSystemProvider provider : FileSystemProvider.installedProviders()) {
            if (provider.getScheme().equalsIgnoreCase(scheme)) {
                return provider.getPath(uri);
            }
        }

        throw new FileSystemNotFoundException("Provider " + scheme + " not installed");
    }
- uri.getScheme() is file when run from IDEA, and becomes jar once the code is packaged as a jar.
- When the scheme is file, the URI is handled by FileSystems.getDefault().provider(). This provider is WindowsFileSystemProvider on Windows and LinuxFileSystemProvider on Linux.
- FileSystemProvider.installedProviders() returns WindowsFileSystemProvider and ZipFileSystemProvider on Windows, and LinuxFileSystemProvider and ZipFileSystemProvider on Linux.
- When the scheme is not file, the URI is handled by the provider in FileSystemProvider.installedProviders() whose scheme matches uri.getScheme(), which here is ZipFileSystemProvider.
- The FileSystem that belongs to ZipFileSystemProvider has to be created by the caller. For how to create and use it, see: https://docs.oracle.com/javase/8/docs/technotes/guides/io/fsp/zipfilesystemprovider.html
Solution:
Handle the failure at the Path path = Paths.get(uri) call:
    Path path = null;
    try {
        path = Paths.get(uri);
    } catch (FileSystemNotFoundException e) {
        // @see https://stackoverflow.com/questions/25032716/getting-filesystemnotfoundexception-from-zipfilesystemprovider-when-creating-a-p
        // @see http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html
        Map<String, String> env = new HashMap<>();
        env.put("create", "true");
        FileSystem zipfs = FileSystems.newFileSystem(uri, env);
        path = Paths.get(uri);
    }
Or load the resource file another way:
    byte[] data;
    try (InputStream in = getClass().getResourceAsStream("/elasticsearch/segmentsIndex.json")) {
        data = IOUtils.toByteArray(in);
    }
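IOUtils.toByteArray in the snippet above comes from Apache Commons IO. Where that dependency is not available, the same stream-based loading can be done with the JDK alone. A minimal sketch, assuming a hypothetical helper class `ResourceBytes`; the resource `Object.class` is used only because it exists in every JDK, so substitute your own resource name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; the class and method names are illustrative only.
final class ResourceBytes {

    // Reads a classpath resource fully into memory. Because it goes through a stream
    // and never builds a Path, it behaves the same in a directory and inside a jar.
    static byte[] read(Class<?> anchor, String name) throws IOException {
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException("Resource not found: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int n; (n = in.read(buffer)) != -1; ) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Object.class is only a stand-in resource that is always present.
        byte[] data = read(Object.class, "Object.class");
        System.out.println(data.length + " bytes read");
    }
}
```

The explicit null check matters because getResourceAsStream returns null for a missing resource instead of throwing.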
4. Ways to obtain an InputStream
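- For Path and File, a path with a leading slash is resolved from the root directory; a path without one is resolved from the current directory
    Path currentDir = Paths.get("");  // the current project directory; same as new File("")
    Path rootDir = Paths.get("/");    // the root directory (on Windows, the root of the current drive); same as new File("/")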
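What "current directory" means here can be made visible by converting relative paths to absolute ones. A minimal sketch, assuming a hypothetical class `PathBaseDemo`; the relative path conf/sh.txt is borrowed from the example in section 3.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class; the names are illustrative only.
final class PathBaseDemo {

    // Empty and relative paths are resolved against the working directory of the JVM,
    // which is the value the "user.dir" system property had at startup.
    static Path resolveAgainstWorkingDir(String relative) {
        return Paths.get(relative).toAbsolutePath();
    }

    public static void main(String[] args) {
        System.out.println(resolveAgainstWorkingDir(""));            // the working directory itself
        System.out.println(resolveAgainstWorkingDir("conf/sh.txt")); // a file below the working directory
        System.out.println(Paths.get("/").toAbsolutePath());         // the root, independent of the working directory
    }
}
```

Because the working directory depends on how the program is started (IDE, command line, service), relative paths are fragile for locating bundled resources.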
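- Reading resource files from /resource
  A path that starts with / is resolved from the resource root; a path without a leading / is resolved relative to the package of the class. Steps: get the current Class object, then call getResourceAsStream().
  getResource() returns a URL instead. On Windows the path of that URL cannot be used directly; it is rejected as an invalid path because of the colon after the drive letter: Exception in thread "main" java.nio.file.InvalidPathException: Illegal char <:> at index 2: /D:/XXXXX
  Use Path together with Files (Files.walk(path), Files.isDirectory(path), Files.readAllLines(path), Files.newBufferedReader(path), etc.)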
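One way around the InvalidPathException mentioned above is to convert the URL to a URI and hand that to Paths.get, instead of passing the string from URL.getPath(). A minimal sketch, assuming a hypothetical helper class `UrlToPath`; it only applies to file: URLs, since jar: URLs still need the file system handling from section 3.

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; the class and method names are illustrative only.
final class UrlToPath {

    // Converts a file: URL (for example one returned by getResource) into a Path.
    // Going through the URI avoids the "/D:/..." string form and decodes escapes such as %20.
    static Path toPath(URL url) {
        try {
            return Paths.get(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        Path original = Paths.get("dir with space", "file.txt").toAbsolutePath();
        URL url = original.toUri().toURL();
        System.out.println(url.getPath()); // percent-encoded string, not a usable file name
        System.out.println(toPath(url));   // a Path the Files methods can work with
    }
}
```

The same conversion also removes the need for a manual URLDecoder.decode step when the path contains spaces or non-ASCII characters.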
Notes:
- class + no leading /: getResource("") and getResource("filename") behave differently
  - Empty string "": returns /build/classes/java/main/com/xxx/xx/x (no leading /, so this is the package path)
  - Non-empty name, two cases:
    - If the file exists, returns /build/resources/main/com/xxx/xx/x (no leading /, so this is the package path under /resource)
    - If the file does not exist, returns null
- class + leading /: getResource("/") returns /build/classes/java/main/ (under Gradle this is the path regardless of the operating system)
  - With a path after the /, returns the file under /build/resources/main/ (commonly used)
- classLoader + no leading /: getClassLoader().getResource("") returns the same result as getResource("/") (commonly used)
- classLoader + leading /: getClassLoader().getResource("/") returns null (explained later)
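    /**
     * Get the path directly from the file name with getPath().
     *
     * @param fileName name of the resource file
     * @throws IOException if the file cannot be read
     */
    public void func00(String fileName) throws IOException {
        // Note: for getResource(""), the argument is an empty string
        String path = this.getClass().getClassLoader().getResource(fileName).getPath();
        System.out.println(path);
        // Non-ASCII characters (such as Chinese) in the path are URL-encoded, so decode them here
        String filePath = URLDecoder.decode(path, "UTF-8");
        System.out.println(filePath);
        getFileContent(filePath);
    }

    /**
     * Get the path directly from the file name with getFile().
     *
     * url.getFile()=/pub/files/foobar.txt?id=123456
     * url.getPath()=/pub/files/foobar.txt
     *
     * @param fileName name of the resource file
     * @throws IOException if the file cannot be read
     */
    public void func01(String fileName) throws IOException {
        // Note: for getResource(""), the argument is an empty string
        String path = this.getClass().getClassLoader().getResource(fileName).getFile();
        System.out.println(path);
        // Non-ASCII characters (such as Chinese) in the path are URL-encoded, so decode them here
        String filePath = URLDecoder.decode(path, "UTF-8");
        System.out.println(filePath);
        getFileContent(filePath);
    }

    /**
     * Get the stream directly with getResourceAsStream.
     * Spring Boot projects need this approach, because a file inside a jar has no real path on the file system.
     *
     * @param fileName name of the resource file
     * @throws IOException if the file cannot be read
     */
    public void func02(String fileName) throws IOException {
        // Non-ASCII characters (such as Chinese) in the path are URL-encoded, so decode them here
        String filePath = URLDecoder.decode(fileName, "UTF-8");
        InputStream in = this.getClass().getClassLoader().getResourceAsStream(filePath);
        getFileContent(in);
    }

    /**
     * Get the stream through the ClassPathResource class; recommended for Spring Boot.
     * Spring Boot projects need this approach, because a file inside a jar has no real path on the file system.
     *
     * @param fileName name of the resource file
     * @throws IOException if the file cannot be read
     */
    public void func03(String fileName) throws IOException {
        ClassPathResource classPathResource = new ClassPathResource(fileName);
        InputStream inputStream = classPathResource.getInputStream();
        getFileContent(inputStream);
    }

    /**
     * Locate a file in the project through an absolute path. new File("") gives the current
     * absolute path, but that is only a local absolute path and cannot be used on a server.
     *
     * @param fileName name of the resource file
     * @throws IOException if the file cannot be read
     */
    public void func04(String fileName) throws IOException {
        // Empty argument
        File directory = new File("");
        // Canonical path: getCanonicalPath() returns an absolute path with symbols such as ..\ and .\ resolved
        String rootCanonicalPath = directory.getCanonicalPath();
        // Absolute path: getAbsolutePath() returns the full path as given if the File was constructed with one;
        // if it was constructed with a relative path, it returns the current directory + the path given to the File
        String rootAbsolutePath = directory.getAbsolutePath();
        System.out.println(rootCanonicalPath);
        System.out.println(rootAbsolutePath);
        String filePath = rootCanonicalPath + "\\chapter-2-springmvc-quickstart\\src\\main\\resources\\" + fileName;
        getFileContent(filePath);
    }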
5. Ways to read an InputStream
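An ordinary Reader is in fact built on top of an InputStream.
Always try to specify the encoding for a Reader: the Reader reads a byte stream from the InputStream and then, according to the configured encoding, converts the bytes to chars, which is how a character stream (Reader) is implemented.
A Reader is essentially a byte-to-char converter based on an InputStream.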
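- FileReader
  Opens a file and returns a Reader. The default encoding depends on the system. The FileReader source shows that it holds a FileInputStream internally, which has to be closed properly.
- CharArrayReader
  CharArrayReader simulates a Reader in memory, similar to ByteArrayInputStream.
- StringReader
  StringReader uses a String directly as its data source; it is almost identical to CharArrayReader.
- InputStreamReader
  If we already have an InputStream and want to turn it into a Reader, InputStreamReader is exactly that converter.
- BufferedReader
  Once we have an InputStreamReader and want to read line by line, we can use BufferedReader. It only accepts a Reader as its constructor argument; it is a wrapper.
These classes fall into two groups:
one group provides the data directly;
the other group adds extra functionality, similar to FilterInputStream (the Filter pattern).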
    // Option 1: BufferedReader
    String inContent = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))
            .lines()
            // .skip(13)    // optionally skip a number of lines
            // .parallel()  // optionally run in parallel
            .collect(Collectors.joining("\n"));

    // Option 2 (alternative to option 1): StringBuilder
    final int bufferSize = 4 * 0x400; // 4096 chars
    char[] buffer = new char[bufferSize];
    StringBuilder sb = new StringBuilder();
    Reader in = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
    for (int numRead; (numRead = in.read(buffer, 0, bufferSize)) > 0; ) {
        sb.append(buffer, 0, numRead);
    }
    String inContent = sb.toString();
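Why the encoding should be set explicitly becomes clear when the same bytes are decoded with two different charsets. A minimal sketch, assuming a hypothetical class `ReaderCharsetDemo`; the sample text is two Chinese characters written as Unicode escapes so that the source file stays ASCII.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
final class ReaderCharsetDemo {

    // Decodes the given bytes through an InputStreamReader with the given charset.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[1024];
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            for (int n; (n = in.read(buffer, 0, buffer.length)) > 0; ) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587";                        // two Chinese characters
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8); // 3 bytes per character
        // Same charset on both sides: the text round-trips.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(text));
        // Wrong charset: every byte becomes its own char, so the length no longer matches.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).length());
    }
}
```

A Reader created without a charset uses the platform default, so the second outcome can appear on one machine and not on another.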
6. ClassLoader#getResourceAsStream() does not accept a leading slash /
No leading “/” (all names are absolute)
https://stackoverflow.com/questions/47900677/where-does-leading-slash-in-java-class-loader-getresource-leads-to
Leading slash works only for class.getResource() to override its default behavior. There is no leading slash concept for class.getClassLoader().getResource(), so it always returns null.
https://stackoverflow.com/questions/3803326/this-getclass-getclassloader-getresource-and-nullpointerexception
The reason you can't use a leading / in the ClassLoader path is because all ClassLoader paths are absolute and so / is not a valid first character in the path.
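The difference between the two lookup styles can be observed with a resource that exists in every JDK. A minimal sketch, assuming a hypothetical class `LeadingSlashDemo`; it looks up the bytecode of java.lang.Object, and on a standard JDK class path the last lookup is expected to return null.

```java
import java.net.URL;

// Hypothetical demo class; the names are illustrative only.
final class LeadingSlashDemo {

    // Class#getResource: relative to the package of the class, or from the root with a leading slash.
    static URL viaClass(String name) {
        return Object.class.getResource(name);
    }

    // ClassLoader#getResource: names are always resolved from the root and take no leading slash.
    static URL viaClassLoader(String name) {
        return ClassLoader.getSystemClassLoader().getResource(name);
    }

    public static void main(String[] args) {
        System.out.println(viaClass("Object.class"));                  // relative to the package java.lang
        System.out.println(viaClass("/java/lang/Object.class"));       // leading slash: from the root
        System.out.println(viaClassLoader("java/lang/Object.class"));  // class loader form, no leading slash
        System.out.println(viaClassLoader("/java/lang/Object.class")); // leading slash is not understood here
    }
}
```

The same rule applies to getResourceAsStream, which is why a leading slash that works on a Class stops working when the call is moved to its ClassLoader.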