Learning Pentaho Data Integration 8 CE

By now, you have learned how to do several kinds of calculations that enrich a dataset. Another frequently used kind of operation has nothing to do with enriching the data: it discards, or filters out, unwanted information. That's the core of this section.
Suppose you have a dataset and you only want to keep the rows that match a condition. To demonstrate how to implement this kind of filtering, we will read a file, build a list of the words found in it, and then filter out the nulls and unwanted words. We will split the exercise into two parts: first reading the file and building the list of words, and then filtering that list.
Let's start by reading a sample file.
Before starting, you'll need at least one text file to play with. The text file used in this tutorial is named smcng10.txt. Its content is about...
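To make the goal of the exercise concrete before building it in PDI, here is a minimal Python sketch of the same logic: turn lines of text into one word per row, then keep only the rows that pass a condition. It is not PDI code and not the book's implementation. The function names, the `UNWANTED` word list, and the rule that a blank line produces a null row are all assumptions made for illustration.

```python
import re

# Hypothetical list of unwanted words; the tutorial does not specify one here.
UNWANTED = {"the", "a", "of"}


def words_from_lines(lines):
    """Yield one lowercase word per row; a line with no words yields a None row."""
    for line in lines:
        tokens = re.findall(r"[A-Za-z']+", line)
        if not tokens:
            # Assumption: an empty line becomes a null row, to be filtered later.
            yield None
        for token in tokens:
            yield token.lower()


def keep_row(word):
    """The filter condition: drop nulls and unwanted words."""
    return word is not None and word not in UNWANTED


def filter_words(lines):
    """Build the list of words and keep only the rows that match the condition."""
    return [word for word in words_from_lines(lines) if keep_row(word)]


if __name__ == "__main__":
    sample = ["The cat", "", "a hat of gold"]
    print(filter_words(sample))
```

To try it on a real file, pass an open file object instead of the sample list, for example `filter_words(open("smcng10.txt"))`, assuming the file sits in the working directory. Keeping the condition in its own function mirrors the idea of the exercise: the rows flow through unchanged, and a single test decides which ones survive.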