Data Science for Document Analysis and Understanding

The France Excellence 2018 Summer School "Data Science for Document Analysis and Understanding" will take place from July 1 to 28, 2018, at Université de La Rochelle (July 1-14) and INRIA (Paris, July 15-28).

The school will introduce state-of-the-art data science techniques for analyzing and understanding documents and digital content, from their digitization to their exploitation in applications such as search engines (information retrieval), machine translation, and social network analysis.

The school will rely on the skills of researchers from the organizing institutes and of invited international experts from across Europe.

The University of La Rochelle is a French higher education and research institution founded in 1993. It has four faculties, ten laboratories, about 8,200 students and 900 employees. The university was recently reorganized to specialize in three major transformations ongoing within society: sustainability, environment and digitization. The La Rochelle part of the school is organized by the L3i laboratory (Laboratoire Informatique, Image et Interaction), a 110-person computer science laboratory created in 1993 and centered on the study of digital content produced by humans and intended for humans.

Inria, the French National Institute for computer science and applied mathematics, hosts the Paris part of the school. It promotes scientific excellence for technology transfer and society. Graduates from the world's top universities, Inria's 2,600 employees rise to the challenges of digital sciences. Research at Inria is organised in “project teams” which bring together researchers with complementary skills to focus on specific scientific projects. With this open, agile model, Inria is able to explore original approaches with its partners in industry and academia and provide an efficient response to the multidisciplinary and application challenges of the digital transformation. The source of many innovations that add value and create jobs, Inria transfers expertise and research results to companies (startups, SMEs and major groups) in fields as diverse as healthcare, transport, energy, communications, security and privacy protection, smart cities and the factory of the future.



Arrival in France:

You will arrive on July 1st, 2018, at Paris Roissy-Charles-de-Gaulle airport, where you will be picked up and taken to La Rochelle.


1/ La Rochelle


Accommodation will consist of individual rooms at the cité Antinéa residence of CROUS.

The accommodation is a 5-minute walk from the lecture room, a 10-minute walk from the beach, and a 15-minute walk from the old harbour.


Lunches are included on weekdays and during tours. The remaining meals can be cooked in one of the kitchens of the residence, or bought at one of the many nearby restaurants.

Local Transportation

There is no need for local transport: your housing, the classroom, the old harbour, the city center and the beaches are all close together and can easily be reached on foot.

Transfer from La Rochelle to Paris

The transfer from La Rochelle to Paris will take place over the weekend of July 14th/15th and will offer the opportunity to visit some of the most beautiful Loire Valley castles.


2/ Paris


From July 15th to July 28th, the students will be lodged at the Cité internationale universitaire de Paris, 17 bd Jourdan, 75014 Paris.


Breakfasts will be provided. Lunch will be provided on weekdays and during tours.

Local Transportation

A 5-zone transportation card for Paris and the surrounding region will be provided to each student.

Back to China

Your return to China will be on Saturday, July 28, 2018, from Paris Roissy-Charles-de-Gaulle airport (transfer from the accommodation to the airport in the morning).



1/ La Rochelle


Main school venue:


Pôle communication, multimédia et réseaux 

University of La Rochelle

44, avenue Albert Einstein

17000 La Rochelle, France


The main conference room was recently built in the « Pôle communication, multimédia et réseaux ». It can seat 200 people and is equipped with the latest recording and broadcasting facilities. The stage has 7 seats with microphones, e.g., for panel discussions. Full details are available at the following URL (in French):

Smaller rooms and lab rooms are available for parallel sessions and exercises.


The classroom is located 200 meters from the « Port des Minimes », a 10-minute walk from the old port and the 3 towers, and a 15-minute walk from the closest beach.




Each course block consists of roughly half lectures and half lab exercises.


Week 1: 2-6 July

Document Analysis


Prof. Josep Lladós (Computer Vision Centre, Barcelona):

An Introduction to Document Analysis – 8 hours


Prof. Jean-Marc Ogier and Dr. Petra Krämer Gomez (University of La Rochelle):

Document Fraud Detection – 8 hours


Prof. Marcus Liwicki (Technical University of Kaiserslautern):

Historical Document Analysis – 8 hours


French culture and French as a foreign language – 4 hours


Tour 1: Fort Boyard and the Isle of Aix (see below for details)


Week 2: 9-13 July

Document Understanding: Natural Language Processing and Information Retrieval


Dr. Gaël Lejeune (University of Paris Sorbonne):

Introduction to Natural Language Processing – 8 hours


Prof. Antoine Doucet (University of La Rochelle):

Robust Resource-Free Language Analysis – 6 hours


Prof. Jean-Loup Guillaume (University of La Rochelle):

Social Networks Analysis – 6 hours


Dr. Karell Bertet (University of La Rochelle):

Data Mining for Information Retrieval – 6 hours


French culture and French as a foreign language – 4 hours


Tour 2: Loire Valley Castles and transfer to Paris, weekend of 14-15 July (see below for details)


2/ Paris & INRIA Saclay


Scientific lectures: 3 days a week in Saclay (Inria building Alan Turing).


Homework and projects: 2 half-days a week in Paris.


French language and culture classes: 2 half-days a week in Paris.


The scientific teaching team is:


Jing-Rebecca Li (INRIA-Saclay, CMAP Ecole Polytechnique, France)

Zoltan Szabo (CMAP Ecole Polytechnique, France)

Balazs Pinter (Eötvös Loránd University, Hungary)



Week 3: 16-20 July


Background Material (Monday, Wednesday, Friday):

Linear algebra and convex optimization.


Data Science (Monday, Wednesday, Friday):

Applications, kernels, classification, kernel PCA/CCA, dependency measures/distances, hypothesis testing, acceleration schemes.


Practical Projects (Tuesday, Thursday)


Professional advising (Tuesday, Thursday): CV, résumé, PhD scholarship applications.



Tour 3: Guided visit of Paris on July 21st.


Week 4: 23-27 July


Natural Language Processing (NLP) (Monday, Wednesday, Friday):

Foundations of NLP: corpora, tokenization, stemming, term vector representations, POS tagging, parsing, language models, stopwords, text classification and clustering, dialogue systems. Deep learning in NLP: recurrent neural networks, LSTM, word vector representations, end-to-end dialogue systems.


Practical Projects (Tuesday, Thursday)

Professional advising (Tuesday, Thursday): CV, résumé, PhD scholarship applications.




La Rochelle

Excerpt from Wikipedia:


La Rochelle is a city in western France and a seaport on the Bay of Biscay, a part of the Atlantic Ocean. Its main touristic feature is the "Vieux Port" ("Old Harbour"), which is at the heart of the city, picturesque and lined with seafood restaurants. The city walls are open to an evening promenade. The old town has been well preserved. From the harbour, boating trips can be taken to the Île d'Aix and Fort Boyard (home to the internationally famous TV show of the same name). Nearby Île de Ré is a short drive to the North. The countryside of the surrounding Charente-Maritime is very rural and full of history (Saintes). To the North is Venise Verte, a marshy area of country, criss-crossed with tiny canals and a popular resort for inland boating. Inland is the country of Cognac and Pineau. The attractive Île de Ré is accessible via a bridge from La Rochelle.


Finally, La Rochelle enjoys a remarkably high number of sunshine hours. The student population is about 10,000 out of 80,000 inhabitants. Tourist information from Wikitravel is given below.


Your everyday tour: Visiting La Rochelle on foot

Taken from the “See” section of La Rochelle's Wikitravel page:


The Old Port ("Vieux Port")

This is the oldest and also the most picturesque part of La Rochelle. Most of the town buildings are hundreds of years old and very well maintained. The narrow streets and pale stone buildings give the Old Port a distinctly Mediterranean quality.

The Three Towers (Tour St. Nicolas, Tour de la Chaine & Tour de la Lanterne)

When visiting the Old Port you cannot fail to notice the three defensive towers, which guard the harbour. They date back to medieval times when control of the city was contested by both French and English. They are well worth a visit although you will have to be in good health to climb the stairs. The staff are friendly and speak both English and French. All the signs to the various rooms and exhibits are also in both French and English. You'll also be given a returnable booklet containing lots of historical information about what is in each room.

The Aquarium

Nature fans will enjoy the huge aquarium, which can be found within easy walking distance of the harbour. The whole tour takes approximately 2-3 hours to experience and is an excellent activity if the weather outside is poor. Audio devices for various languages are available but all educational notices for each exhibit are also translated into English.

Port des Minimes

One of the biggest marinas for pleasure boats in Europe. Be prepared to be blown away by the number of yachts in this enormous port. Also, visit the beach beside the port.




Tour 1: Fort Boyard and the Isle of Aix


During the first week (2-6 July)


This tour will start straight from the old harbour of La Rochelle. A boat trip will take us to the island of Aix, via Fort Boyard.


Fort Boyard is a fort located between the Île-d'Aix and the Île d'Oléron in the Pertuis d'Antioche straits, on the west coast of France and is the filming location for the TV gameshow of the same name. Though a fort on Boyard bank was suggested as early as the 17th century, it was not until the 1800s under Napoleon Bonaparte that work began. Building started in 1801 and was completed in 1857. In 1967, the final scene of the French film Les aventuriers was filmed at the remains of the fort.


Fort Boyard is known worldwide for the TV game show which is produced here and was filmed and tailored for 31 countries under different names. The Chinese version was launched in 2015.


The boat will then drop us off on the Île-d'Aix for lunch and a few hours of free time (walking, cycling, swimming, visiting museums). This charming car-free island is famous for having hosted Napoleon and for being at the center of a key naval battle between the British Navy and the Atlantic Fleet of the French Navy in 1809. In 1815, the island was the last place in France where Napoleon stayed before being taken to Plymouth and then exiled to Saint Helena.


Tour 2: Loire Valley Castles: Chambord, Chenonceau, Cheverny


Weekend of 14/15 July - Departure from La Rochelle and arrival in Paris.


This tour will take us from La Rochelle to the Loire valley and the magnificent castles of the French kings. We will discover three of the Loire Valley's most famous châteaux with an official guide. Visit the majestic Château de Chambord, surrounded by one of the largest forest parks in Europe, the charming Château de Chenonceau, built across the River Cher, and the private, atmospheric Château de Cheverny. Experience what life was like during the Renaissance period in these stunning properties that once accommodated the courts of the Kings of France.



Chambord is one of the largest châteaux in the Loire Valley, nestled in the middle of a huge hunting ground. It was built for King Francis I of France in the 16th century with the help of none other than Leonardo da Vinci. See the amazing double helix staircase, which represented the Tree of Life during the Middle Ages, and the château’s wide panoramic terrace with its sea of chimneys that overlook the grounds.



Chenonceau, also known as ‘The Ladies’ Castle’, was built during the 16th century and has been home to many well-known aristocratic French ladies, such as Diane de Poitiers, Catherine de Medici and the White Queen, Louise de Lorraine. You could almost be dreaming as Chenonceau appears over the River Cher. Its fairytale architecture bears witness to a sophisticated and typically French style of living, with elegantly furnished floral rooms, tapestries, antique paintings and fascinating kitchens housed in the piers of the bridge that supports the château.


The last château, Cheverny, is a stately home decorated with beautiful furnishings dating from the 17th century, with remarkably well-preserved interiors.



Tour 3: Guided Visit of Paris (details to be announced)


Please find here the 2017 summer school of INRIA







Data Science for Document Understanding 

From July 1st to July 28, 2018 in La Rochelle and in Paris

The total number of places is limited to 40.

The price does not include the trip from China to Paris.

Accommodation, lunches and local transportation are all included. Breakfast and dinner are at the student's expense.

The summer school awards 5 ECTS credits.

The summer school is reserved for Chinese master's students.

To apply, please follow the application procedure on the main page.

Price: 25000 CNY

The French Embassy reserves a limited number of grants for students with an exceptional academic background. Grants cover the tuition fees but do not cover the trip from China to Paris.