
# Creating a Batch Service

This guide walks you through the process of creating a basic batch-driven solution.

## What You Will Build

You will build a service that imports data from a CSV spreadsheet, transforms it with custom code, and stores the final results in a database.

## What You Need

* About 15 minutes
* A favorite text editor or IDE
* [JDK 1.8](http://www.oracle.com/technetwork/java/javase/downloads/index.html) or later
* [Gradle 4+](http://www.gradle.org/downloads) or [Maven 3.2+](https://maven.apache.org/download.cgi)
* You can also import the code straight into your IDE:
  * [Spring Tool Suite (STS)](https://spring.io/guides/gs/sts)
  * [IntelliJ IDEA](https://spring.io/guides/gs/intellij-idea/)

## How to Complete This Guide

Like most Spring [Getting Started guides](https://spring.io/guides), you can start from scratch and complete each step, or you can bypass basic setup steps that are already familiar to you. Either way, you end up with working code.

To **start from scratch**, move on to [Starting with Spring Initializr](https://spring.io/guides/gs/batch-processing/#scratch).

To **skip the basics**, do the following:

* [Download](https://github.com/spring-guides/gs-batch-processing/archive/master.zip) and unzip the source repository for this guide, or clone it using [Git](https://spring.io/understanding/Git): `git clone https://github.com/spring-guides/gs-batch-processing.git`
* cd into `gs-batch-processing/initial`
* Jump ahead to [Create a Business Class](https://spring.io/guides/gs/batch-processing/#initial).

**When you finish**, you can check your results against the code in `gs-batch-processing/complete`.
## Business Data

Typically, your customer or a business analyst supplies a spreadsheet. For this simple example, you can find some made-up data in `src/main/resources/sample-data.csv`:

~~~
Jill,Doe
Joe,Doe
Justin,Doe
Jane,Doe
John,Doe
~~~

This spreadsheet contains a first name and a last name on each row, separated by a comma. This is a fairly common pattern that Spring can handle without customization.

Next, you need to write an SQL script to create a table to store the data. You can find such a script in `src/main/resources/schema-all.sql`:

~~~
DROP TABLE people IF EXISTS;

CREATE TABLE people (
    person_id BIGINT IDENTITY NOT NULL PRIMARY KEY,
    first_name VARCHAR(20),
    last_name VARCHAR(20)
);
~~~

Spring Boot runs `schema-@@platform@@.sql` automatically during startup. `-all` is the default for all platforms.

## Starting with Spring Initializr

If you use Maven, visit the [Spring Initializr](https://start.spring.io/#!type=maven-project&language=java&platformVersion=2.4.3.RELEASE&packaging=jar&jvmVersion=1.8&groupId=com.example&artifactId=batch-processing&name=batch-processing&description=Demo%20project%20for%20Spring%20Boot&packageName=com.example.batch-processing&dependencies=batch,hsql) to generate a new project with the required dependencies (Spring Batch and HyperSQL Database).

The following listing shows the `pom.xml` file that is created when you choose Maven:

~~~
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.4.3</version>
        <relativePath/>
    </parent>
    <groupId>com.example</groupId>
    <artifactId>batch-processing</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>batch-processing</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-batch</artifactId>
        </dependency>

        <dependency>
            <groupId>org.hsqldb</groupId>
            <artifactId>hsqldb</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>
~~~

If you use Gradle, visit the [Spring Initializr](https://start.spring.io/#!type=gradle-project&language=java&platformVersion=2.4.3.RELEASE&packaging=jar&jvmVersion=1.8&groupId=com.example&artifactId=batch-processing&name=batch-processing&description=Demo%20project%20for%20Spring%20Boot&packageName=com.example.batch-processing&dependencies=batch,hsql) to generate a new project with the required dependencies (Spring Batch and HyperSQL Database).

The following listing shows the `build.gradle` file that is created when you choose Gradle:

~~~
plugins {
    id 'org.springframework.boot' version '2.4.3'
    id 'io.spring.dependency-management' version '1.0.11.RELEASE'
    id 'java'
}

group = 'com.example'
version = '0.0.1-SNAPSHOT'
sourceCompatibility = '1.8'

repositories {
    mavenCentral()
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-batch'
    runtimeOnly 'org.hsqldb:hsqldb'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    testImplementation 'org.springframework.batch:spring-batch-test'
}

test {
    useJUnitPlatform()
}
~~~

### Manual Initialization (optional)

If you want to initialize the project manually rather than use the links shown earlier, follow these steps:

1. Navigate to [https://start.spring.io](https://start.spring.io). This service pulls in all the dependencies you need for an application and does most of the setup for you.
2. Choose either Gradle or Maven and the language you want to use. This guide assumes that you chose Java.
3. Click **Dependencies** and select **Spring Batch** and **HyperSQL Database**.
4. Click **Generate**.
5. Download the resulting ZIP file, which is an archive of an application that is configured with your choices.

If your IDE has the Spring Initializr integration, you can complete this process from your IDE.

## Create a Business Class

Now that you can see the format of the data inputs and outputs, you can write code to represent a row of data, as the following example (from `src/main/java/com/example/batchprocessing/Person.java`) shows:

~~~
package com.example.batchprocessing;

// One row of the CSV file: a first name and a last name.
public class Person {

    private String lastName;
    private String firstName;

    // The no-argument constructor lets the reader create the bean and populate it through setters.
    public Person() {
    }

    public Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getFirstName() {
        return firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    @Override
    public String toString() {
        return "firstName: " + firstName + ", lastName: " + lastName;
    }

}
~~~

You can instantiate the `Person` class either with a first name and a last name through a constructor or by setting the properties.

## Create an Intermediate Processor

A common paradigm in batch processing is to ingest data, transform it, and then pipe it out somewhere else. Here, you need to write a simple transformer that converts the names to uppercase. The following listing (from `src/main/java/com/example/batchprocessing/PersonItemProcessor.java`) shows how to do so:

~~~
package com.example.batchprocessing;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import org.springframework.batch.item.ItemProcessor;

public class PersonItemProcessor implements ItemProcessor<Person, Person> {

    private static final Logger log = LoggerFactory.getLogger(PersonItemProcessor.class);

    @Override
    public Person process(final Person person) throws Exception {
        // Build a new Person rather than mutating the input.
        final String firstName = person.getFirstName().toUpperCase();
        final String lastName = person.getLastName().toUpperCase();

        final Person transformedPerson = new Person(firstName, lastName);

        log.info("Converting (" + person + ") into (" + transformedPerson + ")");

        return transformedPerson;
    }

}
~~~

`PersonItemProcessor` implements Spring Batch's `ItemProcessor` interface. This makes it easy to wire the code into a batch job that you will define later in this guide. According to the interface, you receive an incoming `Person` object, after which you transform it to an upper-cased `Person`.
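To try the transformation in isolation, without Spring Batch on the classpath, the following minimal sketch mirrors the processor's logic in plain Java. `UppercaseSketch` and `SimpleProcessor` are hypothetical names introduced only for this illustration; `SimpleProcessor` is a stand-in with the same shape as `ItemProcessor`, not a Spring Batch type, and the nested `Person` is a cut-down copy of the class shown earlier.

```java
class UppercaseSketch {

    /** Hypothetical stand-in with the same shape as Spring Batch's ItemProcessor<I, O>. */
    interface SimpleProcessor<I, O> {
        O process(I item) throws Exception;
    }

    /** Cut-down copy of the guide's Person, reduced to what this sketch needs. */
    static class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        @Override
        public String toString() {
            return "firstName: " + firstName + ", lastName: " + lastName;
        }
    }

    /** Same idea as PersonItemProcessor: return a new, upper-cased Person. */
    static Person uppercase(Person person) {
        return new Person(person.firstName.toUpperCase(), person.lastName.toUpperCase());
    }

    public static void main(String[] args) throws Exception {
        // A method reference is enough to satisfy the single-method interface.
        SimpleProcessor<Person, Person> processor = UppercaseSketch::uppercase;

        Person in = new Person("Jill", "Doe");
        Person out = processor.process(in);

        // prints: Converting (firstName: Jill, lastName: Doe) into (firstName: JILL, lastName: DOE)
        System.out.println("Converting (" + in + ") into (" + out + ")");
    }
}
```

Because the processor returns a new object instead of changing its input, the original item stays available for logging, which is what the guide's log message relies on.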
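The input and output types need not be the same. In fact, after one source of data is read, sometimes the application's data flow needs a different data type.

## Put Together a Batch Job

Now you need to put together the actual batch job. Spring Batch provides many utility classes that reduce the need to write custom code. Instead, you can focus on the business logic.

To configure your job, you must first create a Spring `@Configuration` class like the following example in `src/main/java/com/example/batchprocessing/BatchConfiguration.java`:

~~~
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    // Factories used further below to build the job and its step.
    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    ...

}
~~~

For starters, the `@EnableBatchProcessing` annotation adds many critical beans that support jobs and save you a lot of leg work. This example uses a memory-based database (provided by `@EnableBatchProcessing`), meaning that, when it is done, the data is gone. It also autowires a couple of factories needed further below. Now add the following beans to your `BatchConfiguration` class to define a reader, a processor, and a writer:

~~~
@Bean
public FlatFileItemReader<Person> reader() {
    return new FlatFileItemReaderBuilder<Person>()
        .name("personItemReader")
        .resource(new ClassPathResource("sample-data.csv"))
        .delimited()
        // Column order in the CSV file, mapped to Person bean properties.
        .names(new String[]{"firstName", "lastName"})
        .fieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
            setTargetType(Person.class);
        }})
        .build();
}

@Bean
public PersonItemProcessor processor() {
    return new PersonItemProcessor();
}

@Bean
public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Person>()
        // Resolves :firstName and :lastName from the Person getters.
        .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
        .sql("INSERT INTO people (first_name, last_name) VALUES (:firstName, :lastName)")
        .dataSource(dataSource)
        .build();
}
~~~

The first chunk of code defines the input, processor, and output.

* `reader()` creates an `ItemReader`. It looks for a file called `sample-data.csv` and parses each line item with enough information to turn it into a `Person`.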
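* `processor()` creates an instance of the `PersonItemProcessor` that you defined earlier, meant to convert the data to upper case.
* `writer(DataSource)` creates an `ItemWriter`. This one is aimed at a JDBC destination and automatically gets a copy of the dataSource created by `@EnableBatchProcessing`. It includes the SQL statement needed to insert a single `Person`, driven by Java bean properties.

The last chunk (from `src/main/java/com/example/batchprocessing/BatchConfiguration.java`) shows the actual job configuration:

~~~
@Bean
public Job importUserJob(JobCompletionNotificationListener listener, Step step1) {
    return jobBuilderFactory.get("importUserJob")
        .incrementer(new RunIdIncrementer())
        .listener(listener)
        .flow(step1)
        .end()
        .build();
}

@Bean
public Step step1(JdbcBatchItemWriter<Person> writer) {
    return stepBuilderFactory.get("step1")
        // Write up to 10 records at a time.
        .<Person, Person> chunk(10)
        .reader(reader())
        .processor(processor())
        .writer(writer)
        .build();
}
~~~

The first method defines the job, and the second one defines a single step. Jobs are built from steps, where each step can involve a reader, a processor, and a writer.

In this job definition, you need an incrementer, because jobs use a database to maintain execution state. You then list each step (though this job has only one step). The job ends, and the Java API produces a perfectly configured job.

In the step definition, you define how much data to write at a time. In this case, it writes up to ten records at a time. Next, you configure the reader, processor, and writer by using the beans injected earlier.

`chunk()` is prefixed with `<Person,Person>` because it is a generic method. This represents the input and output types of each "chunk" of processing and lines up with `ItemReader<Person>` and `ItemWriter<Person>`.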
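The chunk size in the step definition can be easier to picture with a small model. The sketch below is a plain-Java illustration of the idea only, not Spring Batch's implementation: `ChunkSketch` and its `run` method are hypothetical names, the reader is modeled as an `Iterator`, the processor as a `Function`, and "writing" simply collects each chunk into a list. Spring Batch additionally wraps each chunk in a transaction and supports features such as restart, skip, and retry, none of which is modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class ChunkSketch {

    /**
     * Reads items one at a time, processes each one, and "writes" them in
     * lists of at most chunkSize items. Returns the chunks that were written.
     */
    static <I, O> List<List<O>> run(Iterator<I> reader,
                                    Function<I, O> processor,
                                    int chunkSize) {
        List<List<O>> written = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                written.add(chunk);            // a full chunk is written at once
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            written.add(chunk);                // the final chunk may be smaller
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Jill", "Joe", "Justin", "Jane", "John");

        // The explicit <String, String> plays the same role as <Person, Person> on chunk():
        // it names the input and output types of a generic method.
        List<List<String>> chunks =
                ChunkSketch.<String, String>run(names.iterator(), String::toUpperCase, 2);

        // prints: [[JILL, JOE], [JUSTIN, JANE], [JOHN]]
        System.out.println(chunks);
    }
}
```

With the guide's five-row sample file and a chunk size of 10, everything fits into a single chunk, so the writer is invoked once.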
The last bit of batch configuration is a way to get notified when the job completes. The following example (from `src/main/java/com/example/batchprocessing/JobCompletionNotificationListener.java`) shows such a class:

~~~
package com.example.batchprocessing;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.listener.JobExecutionListenerSupport;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;

@Component
public class JobCompletionNotificationListener extends JobExecutionListenerSupport {

    private static final Logger log = LoggerFactory.getLogger(JobCompletionNotificationListener.class);

    private final JdbcTemplate jdbcTemplate;

    @Autowired
    public JobCompletionNotificationListener(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
            log.info("!!! JOB FINISHED! Time to verify the results");

            // Read back every row the job inserted and log it.
            jdbcTemplate.query("SELECT first_name, last_name FROM people",
                (rs, row) -> new Person(
                    rs.getString(1),
                    rs.getString(2))
            ).forEach(person -> log.info("Found <" + person + "> in the database."));
        }
    }
}
~~~

The `JobCompletionNotificationListener` listens for when a job is `BatchStatus.COMPLETED` and then uses `JdbcTemplate` to inspect the results.

## Make the Application Executable

Although batch processing can be embedded in web apps and WAR files, the simpler approach demonstrated below creates a standalone application. You package everything in a single, executable JAR file, driven by a good old Java `main()` method.

The Spring Initializr created an application class for you. For this simple example, it works without further modification. The following listing (from `src/main/java/com/example/batchprocessing/BatchProcessingApplication.java`) shows the application class:

~~~
package com.example.batchprocessing;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class BatchProcessingApplication {

    public static void main(String[] args) throws Exception {
        // Run the application, then exit the JVM with the application's exit code.
        System.exit(SpringApplication.exit(SpringApplication.run(BatchProcessingApplication.class, args)));
    }
}
~~~

`@SpringBootApplication` is a convenience annotation that adds all of the following:

* `@Configuration`: Tags the class as a source of bean definitions for the application context.
* `@EnableAutoConfiguration`: Tells Spring Boot to start adding beans based on classpath settings, other beans, and various property settings. For example, if `spring-webmvc` is on the classpath, this annotation flags the application as a web application and activates key behaviors, such as setting up a `DispatcherServlet`.
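* `@ComponentScan`: Tells Spring to look for other components, configurations, and services in the `com.example.batchprocessing` package, which is how it finds the `JobCompletionNotificationListener`.

The `main()` method uses Spring Boot's `SpringApplication.run()` method to launch an application. Did you notice that there was not a single line of XML? There is no `web.xml` file, either. This application is 100% pure Java, and you did not have to deal with configuring any plumbing or infrastructure.

Note that `SpringApplication.exit()` and `System.exit()` ensure that the JVM exits upon job completion. See the [Application Exit section in the Spring Boot Reference documentation](https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#boot-features-application-exit) for more details.

For demonstration purposes, there is code that uses a `JdbcTemplate` to query the database and print out the names of the people the batch job inserts.

### Build an Executable JAR

You can run the application from the command line with Gradle or Maven. You can also build a single executable JAR file that contains all the necessary dependencies, classes, and resources and run that. Building an executable jar makes it easy to ship, version, and deploy the service as an application throughout the development lifecycle, across different environments, and so forth.

If you use Gradle, you can run the application by using `./gradlew bootRun`. Alternatively, you can build the JAR file by using `./gradlew build` and then run the JAR file, as follows:

~~~
java -jar build/libs/gs-batch-processing-0.1.0.jar
~~~

If you use Maven, you can run the application by using `./mvnw spring-boot:run`. Alternatively, you can build the JAR file with `./mvnw clean package` and then run the JAR file, as follows:

~~~
java -jar target/gs-batch-processing-0.1.0.jar
~~~

The JAR file name follows the artifact name and version in your build file, so adjust the path if yours differ. For example, the `pom.xml` shown earlier produces `target/batch-processing-0.0.1-SNAPSHOT.jar`.

The steps described here create a runnable JAR. You can also build a classic WAR file.

The job prints out a line for each person that gets transformed. After the job runs, you can also see the output from querying the database. It should resemble the following output:

~~~
Converting (firstName: Jill, lastName: Doe) into (firstName: JILL, lastName: DOE)
Converting (firstName: Joe, lastName: Doe) into (firstName: JOE, lastName: DOE)
Converting (firstName: Justin, lastName: Doe) into (firstName: JUSTIN, lastName: DOE)
Converting (firstName: Jane, lastName: Doe) into (firstName: JANE, lastName: DOE)
Converting (firstName: John, lastName: Doe) into (firstName: JOHN, lastName: DOE)
Found <firstName: JILL, lastName: DOE> in the database.
Found <firstName: JOE, lastName: DOE> in the database.
Found <firstName: JUSTIN, lastName: DOE> in the database.
Found <firstName: JANE, lastName: DOE> in the database.
Found <firstName: JOHN, lastName: DOE> in the database.
~~~

## Summary

Congratulations! You built a batch job that ingested data from a spreadsheet, processed it, and wrote it to a database.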