Reading the Contents of an Uploaded Word Document with POI in an SSM Project


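For my graduation project I needed a feature that uploads a document, reads its contents, and saves them to the database. I had used Apache POI in an earlier course project, so it was the natural choice. This post implements (and records) how I did it.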

Downloading POI

Download the latest zip package directly from the official website.

Usage

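After extraction, the directory looks like this:
[image.png: screenshots of the extracted directory]
Because I will also import and parse Excel spreadsheets later, I added all the required jars to the project. Some of them were already present from setting up SSM, so those did not need to be added again. Now let's write the code.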

JSP page

<form class="form-horizontal" id="homework_submit">
	<input id="enclosure" name="enclosure" type="file" 
		accept="application/msword,application/vnd.openxmlformats-officedocument.wordprocessingml.document">
	<button type="button" onClick="homework_submit();">Submit</button>
</form>

function homework_submit() {
	var url = .....
	$.ajax({
		type: 'POST',
		url: url,
		cache: false,
		data: new FormData($('#homework_submit')[0]),
		processData: false,
		contentType: false,
		success: function(data){
			//...
		},
		error:function(data) {
			//...
		},
	});	
}

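FormData submits the form as multipart/form-data by default, so the enctype does not need to be set on the form here. There are of course many ways to upload a form; as a beginner, I simply use whichever is easiest.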

Controller

// Submit homework
@RequestMapping(value="/saveHomework/{sId}",method=RequestMethod.POST)
@ResponseBody
public Integer saveHomework(HttpServletResponse response, @ModelAttribute MultipartFile enclosure,Submitted submitted,HttpSession session) {
	int result = submittedService.insertSubmitted(response,enclosure, submitted, session);
	return result;
}

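Note that the parameter name in @ModelAttribute MultipartFile enclosure must match the name of the file input on the front end. Also, because this is an SSM project, a multipart resolver for file uploads has to be configured in applicationContext.xml: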

<!-- Define the file upload resolver -->
<bean id="multipartResolver"
	class="org.springframework.web.multipart.commons.CommonsMultipartResolver">
	<!-- Default encoding -->
	<property name="defaultEncoding" value="UTF-8"></property>
	<!-- Maximum upload size: 5 MB, 5*1024*1024 bytes -->
	<property name="maxUploadSize" value="5242880"></property>
</bean>
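
The maxUploadSize value is given in bytes, so the 5 MB limit in the comment works out to 5 × 1024 × 1024. A minimal Java sketch of that arithmetic follows; the class and method names are hypothetical and exist only for illustration:

```java
/** Hypothetical helper: converts megabytes to the byte count that maxUploadSize expects. */
final class UploadLimits {
    private UploadLimits() {}

    /** Returns the number of bytes in the given number of megabytes (1 MB = 1024 * 1024 bytes). */
    static long megabytesToBytes(long megabytes) {
        return megabytes * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // 5 MB, the limit used in the bean definition
        System.out.println(megabytesToBytes(5)); // prints 5242880
    }
}
```

Changing the limit then only means recomputing the value and pasting it into the property.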

Service

The actual logic lives here:

@Override
public int insertSubmitted(HttpServletResponse response, MultipartFile enclosure, Submitted submitted, HttpSession session) {
	String content = null;
	if (!enclosure.isEmpty()) {
		// Convert the MultipartFile to a File (the temporary file kept by commons-fileupload).
		// Note: this file only exists on disk when the upload was too large to be held in memory.
		CommonsMultipartFile cf = (CommonsMultipartFile) enclosure;
		DiskFileItem fi = (DiskFileItem) cf.getFileItem();
		File file = fi.getStoreLocation();
		String originalFilename = enclosure.getOriginalFilename();
		if (originalFilename.endsWith(".doc")) {
			// Legacy binary Word format: read with HWPF
			try (FileInputStream fis = new FileInputStream(file)) {
				@SuppressWarnings("resource")
				HWPFDocument doc = new HWPFDocument(fis);
				content = doc.getDocumentText();
				System.out.println(content);
			} catch (Exception e) {
				e.printStackTrace();
			}
		} else if (originalFilename.endsWith(".docx")) {
			// OOXML Word format: read with XWPF
			try (FileInputStream fis = new FileInputStream(file)) {
				XWPFDocument xdoc = new XWPFDocument(fis);
				@SuppressWarnings("resource")
				XWPFWordExtractor extractor = new XWPFWordExtractor(xdoc);
				content = extractor.getText();
				System.out.println(content);
			} catch (Exception e) {
				e.printStackTrace();
			}
		} else {
			//...
		}
	}
	// The database insert is not shown in this post; the value returned here is
	// only a placeholder so that the method compiles.
	return 0;
}
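
The service method chooses a parser by checking the filename suffix with endsWith, which is case-sensitive, so a file named REPORT.DOCX would fall through to the final else branch. A minimal sketch of a case-insensitive check is shown below. The WordFormat enum and its fromFilename method are hypothetical names introduced for this illustration and are not part of POI or Spring:

```java
import java.util.Locale;

/** Hypothetical classification of an uploaded file by its filename suffix. */
enum WordFormat {
    DOC, DOCX, UNSUPPORTED;

    /** Maps a filename to a Word format, ignoring the case of the extension. */
    static WordFormat fromFilename(String filename) {
        if (filename == null) {
            return UNSUPPORTED;
        }
        String lower = filename.toLowerCase(Locale.ROOT);
        // Check ".docx" and ".doc" separately; neither suffix is a suffix of the other.
        if (lower.endsWith(".docx")) {
            return DOCX;
        }
        if (lower.endsWith(".doc")) {
            return DOC;
        }
        return UNSUPPORTED;
    }
}
```

The result could then drive a switch in the service, with DOC going to HWPF and DOCX going to XWPF.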

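For converting a MultipartFile to a File, see: http://www.cnblogs.com/hahaxiaoyu/p/5102900.html
Later I found that you can also get an input stream and read from it directly, as in Workbook wb = Workbook.getWorkbook(xxx.getInputStream());, which works well too.
content holds the text that was read; insert it into the database and you are done.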
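
Working from an input stream also makes it possible to look at the file's leading bytes instead of trusting the filename. The sketch below relies on two well-known signatures: OLE2 compound files (the container used by .doc) start with the bytes D0 CF 11 E0 A1 B1 1A E1, and ZIP archives (the container used by .docx) start with PK\x03\x04. The WordSniffer class is a hypothetical name for this illustration. It uses readNBytes, so it assumes Java 9 or later, and it consumes the bytes it reads, so it also assumes a fresh stream can be opened for the actual parsing afterwards:

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper that guesses the container format from the first bytes of a stream. */
final class WordSniffer {
    // First 8 bytes of an OLE2 compound file (used by .doc)
    private static final int[] OLE2 = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // First 4 bytes of a ZIP archive (used by .docx, but also by .xlsx, .pptx, and plain zips)
    private static final int[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    private WordSniffer() {}

    /** Returns "ole2", "zip", or "unknown" depending on the leading bytes of the stream. */
    static String sniff(InputStream in) throws IOException {
        byte[] head = new byte[8];
        int read = in.readNBytes(head, 0, head.length); // Java 9+
        if (startsWith(head, read, OLE2)) {
            return "ole2";
        }
        if (startsWith(head, read, ZIP)) {
            return "zip";
        }
        return "unknown";
    }

    private static boolean startsWith(byte[] data, int length, int[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            // Mask to compare the signed byte as an unsigned value
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A "zip" result only says the file is a ZIP container, so it narrows the choice to XWPF rather than proving the upload is a valid Word document.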

What each POI jar is for

| Component | Application type | Maven artifactId | Notes |
|---|---|---|---|
| POIFS | OLE2 Filesystem | poi | Required to work with OLE2 / POIFS based files |
| HPSF | OLE2 Property Sets | poi | |
| HSSF | Excel XLS | poi | For HSSF only, if common SS is needed see below |
| HSLF | PowerPoint PPT | poi-scratchpad | |
| HWPF | Word DOC | poi-scratchpad | |
| HDGF | Visio VSD | poi-scratchpad | |
| HPBF | Publisher PUB | poi-scratchpad | |
| HSMF | Outlook MSG | poi-scratchpad | |
| OpenXML4J | OOXML | poi-ooxml plus one of poi-ooxml-schemas, ooxml-schemas | Only one schemas jar is needed, see below for differences |
| XSSF | Excel XLSX | poi-ooxml | |
| XSLF | PowerPoint PPTX | poi-ooxml | |
| XWPF | Word DOCX | poi-ooxml | |
| Common SS | Excel XLS and XLSX | poi-ooxml | WorkbookFactory and friends all require poi-ooxml, not just core poi |
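
As a quick way to read the table, the sketch below maps a few file extensions to the Maven artifactId listed for their component. The PoiArtifacts class is a hypothetical lookup written for this post, not something POI ships, and it covers only the Word, Excel, and PowerPoint rows. It uses Map.of, so it assumes Java 9 or later:

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup from file extension to the POI artifactId given in the table above. */
final class PoiArtifacts {
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "doc", "poi-scratchpad",   // HWPF
            "docx", "poi-ooxml",       // XWPF
            "xls", "poi",              // HSSF
            "xlsx", "poi-ooxml",       // XSSF
            "ppt", "poi-scratchpad",   // HSLF
            "pptx", "poi-ooxml");      // XSLF

    private PoiArtifacts() {}

    /** Returns the artifactId for an extension such as "docx", or empty if it is not in the map. */
    static Optional<String> artifactFor(String extension) {
        return Optional.ofNullable(BY_EXTENSION.get(extension.toLowerCase(Locale.ROOT)));
    }
}
```

For the upload feature in this post, that means .doc support comes from poi-scratchpad and .docx support comes from poi-ooxml.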