
Data Processing with Optimus

Data types are the soul of a dataframe: they define how a value is represented in memory and, more importantly, how much memory it will use. Each dataframe technology that Optimus supports has its own set of data types, each designed to represent a specific kind of data. The most common are numeric, string, and datetime values. To find out which data types a given technology supports, consult its website or documentation. This information can be found in the Further reading section of this chapter.
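To make the link between data type and memory use concrete, here is a minimal sketch that uses only Python's standard-library array module rather than any of the dataframe engines Optimus supports. The sample values and the buffer_bytes helper are hypothetical and exist only for illustration; dataframe libraries keep their columns in typed buffers in a broadly similar way, but their exact dtypes and sizes are described in their own documentation.

```python
from array import array

# Hypothetical column of small integers, stored with two different type codes.
values = [1, 2, 3, 4, 5]
as_int8 = array("b", values)   # signed char: 1 byte per value
as_int64 = array("q", values)  # signed long long: 8 bytes per value on common platforms


def buffer_bytes(column):
    """Return the bytes used by the raw values (ignores Python object overhead)."""
    return column.itemsize * len(column)


int8_bytes = buffer_bytes(as_int8)
int64_bytes = buffer_bytes(as_int64)

# Same values, different representation, different memory footprint.
print("int8 column: ", int8_bytes, "bytes")
print("int64 column:", int64_bytes, "bytes")
```

Both arrays hold exactly the same numbers; only the chosen type changes how many bytes each value occupies, which is why picking the narrowest suitable type matters for large datasets.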
Besides the internal data representation, Optimus tries to enrich the data to give the user a better overview of how it can be wrangled. For example, a column of the email type is internally just a string column, but when you request a profile of the data, Optimus reports how many mismatches (data points that do not match the type) the column contains. We'll talk more about the profiler later in this book.
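The idea of a mismatch can be sketched in plain Python. The following is not the Optimus profiler or its API: the pattern, the count_mismatches function, and the sample values are all hypothetical, and the regular expression is deliberately simplified. The sketch also assumes that missing values are counted separately from mismatches.

```python
import re

# Simplified pattern for illustration only; real email validation is stricter.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def count_mismatches(values, pattern=EMAIL_PATTERN):
    """Count non-missing values that do not fully match the pattern."""
    return sum(
        1
        for value in values
        if value is not None and not pattern.fullmatch(value)
    )


# Hypothetical string column that is expected to hold email addresses.
emails = ["ana@example.com", "not-an-email", "bob@example.org", "carl@", None]

mismatches = count_mismatches(emails)
print("mismatches:", mismatches)
```

The column is still stored as ordinary strings; the inferred type is an extra layer of information that tells you how many values need attention before the data can be treated as clean email addresses.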