### Level 1: Cleaning meaningless data from an HTML document
```
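// Parses a local HTML file into a jsoup Document.
// Needs: java.io.File, java.io.IOException, org.jsoup.Jsoup, org.jsoup.nodes.Document
public Document getDoc(String filePath) throws IOException {
    // Example path: ./backups/hotels.ctrip.com_domestic-city-hotel.txt
    File input = new File(filePath);
    // The third argument is the base URI used to resolve relative links.
    Document document = Jsoup.parse(input, "UTF-8", "http://www.ctrip.com/");
    return document;
}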
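/**
 * Returns the cleaned versions of the document.
 *
 * @param doc the parsed HTML document
 * @return a two-element list: the HTML cleaned with Whitelist.basic(),
 *         followed by the HTML cleaned with Whitelist.simpleText()
 */
public List<String> cleanHTML(Document doc) {
    List<String> cleaned = new ArrayList<>();
    String html = doc.toString();
    // basic() keeps common text tags such as a, b, blockquote, li, p, ul.
    String basicHtml = Jsoup.clean(html, Whitelist.basic());
    // simpleText() keeps only b, em, i, strong, u.
    String simpleText = Jsoup.clean(html, Whitelist.simpleText());
    // Note: newer jsoup releases rename Whitelist to Safelist.
    cleaned.add(basicHtml);
    cleaned.add(simpleText);
    return cleaned;
}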
```
### Level 2: Retrieving all hotel information for Beijing from Ctrip
```
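// Parses the hotel JSON response and attaches a price to each hotel.
public List<Hotel> getHotle(String hotelResult) {
    JSONObject result = JSONObject.parseObject(hotelResult);
    List<Hotel> hotels = JSON.parseArray(result.getString("hotelPositionJSON"), Hotel.class);
    if (hotels == null) {
        // No position data in the response: return an empty list.
        return new ArrayList<Hotel>();
    }
    // Add price data
    JSONArray hotelsPrice = result.getJSONArray("htllist");
    if (hotelsPrice != null && !hotelsPrice.isEmpty()) {
        // Stop at the shorter list so getJSONObject(i) cannot go out of bounds.
        int count = Math.min(hotels.size(), hotelsPrice.size());
        for (int i = 0; i < count; i++) {
            JSONObject priceObj = hotelsPrice.getJSONObject(i);
            if (priceObj != null && !priceObj.isEmpty()) {
                Hotel hotel = hotels.get(i);
                String hotelId = priceObj.getString("hotelid");
                double price = priceObj.getDoubleValue("amount");
                // Set the price only when both entries refer to the same hotel.
                if (hotelId != null && hotelId.equals(hotel.getId())) {
                    hotel.setPrice(price);
                }
            }
        }
    }
    return hotels;
}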
```
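
The method above pairs each price entry with the hotel at the same list position and then checks that the ids agree. If the two arrays could ever arrive in a different order, matching by id through a map may be more robust. The following is only a minimal sketch using `java.util`: `HotelPriceMerge`, `PriceEntry`, `mergePrices`, and the trimmed-down `Hotel` class are hypothetical stand-ins rather than the exercise's own classes, and the ids and prices are made-up sample values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HotelPriceMerge {

    // Hypothetical stand-in for the Hotel bean; only id and price are modeled.
    static class Hotel {
        private final String id;
        private double price;

        Hotel(String id) { this.id = id; }

        String getId() { return id; }
        double getPrice() { return price; }
        void setPrice(double price) { this.price = price; }
    }

    // Hypothetical stand-in for one entry of the "htllist" array.
    static class PriceEntry {
        final String hotelId;
        final double amount;

        PriceEntry(String hotelId, double amount) {
            this.hotelId = hotelId;
            this.amount = amount;
        }
    }

    // Sets the price on every hotel whose id appears in the price list,
    // regardless of the order of either list. Hotels without a match are left unchanged.
    static List<Hotel> mergePrices(List<Hotel> hotels, List<PriceEntry> prices) {
        Map<String, Double> priceById = new HashMap<>();
        for (PriceEntry entry : prices) {
            priceById.put(entry.hotelId, entry.amount);
        }
        for (Hotel hotel : hotels) {
            Double price = priceById.get(hotel.getId());
            if (price != null) {
                hotel.setPrice(price);
            }
        }
        return hotels;
    }

    public static void main(String[] args) {
        List<Hotel> hotels = new ArrayList<>();
        hotels.add(new Hotel("1001"));
        hotels.add(new Hotel("1002"));

        // Sample prices listed in a different order than the hotels.
        List<PriceEntry> prices = new ArrayList<>();
        prices.add(new PriceEntry("1002", 458.0));
        prices.add(new PriceEntry("1001", 320.0));

        for (Hotel h : mergePrices(hotels, prices)) {
            System.out.println(h.getId() + " -> " + h.getPrice());
        }
    }
}
```

The map lookup costs one extra pass over the price list but removes the dependence on both arrays sharing the same ordering and length.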