Power Query infer data types with Table.Profile

The Power Query UI automatically detects column data types and performs the transformations for you. Typically these steps look like this:

    let
        Source = Csv.Document(File.Contents("online_retail_II.csv"),[Delimiter=",", Columns=8, Encoding=65001, QuoteStyle=QuoteStyle.None]),
        #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Invoice", type text}, ...etc...})
    in
        #"Changed Type"

However, when I am calling Power Query M code from VBA (see my previous posts) to open an arbitrary text file, I usually do not know in advance what the column names or data types are. So after PromoteHeaders I have to infer the column data types, and there is no built-in Power Query function to do that. The following code for a DetectListValuesType() function, adapted from Imke Feldmann (the BIccountant) and Steve de Peijper, infers the data type from a list of values, with a given margin of error allowed. If I subsequently transform the column, any values which are not valid for the majority type, such as text mixed into a largely numeric column, are converted to Error values.

DetectListValuesType =
// Return the majority data type in a list, with the given proportion of errors allowed.
(values as list, marginforerror as number) as text =>
    let
        maxerrors = List.Count(values) * marginforerror, // 0 <= marginforerror < 1

        distinctvalues = List.Distinct(values),

        // True if the list contains nothing but nulls and/or empty strings
        NullValuesOrEmptyStringsOnly =
            List.NonNullCount(distinctvalues) = 0
            or (List.NonNullCount(distinctvalues) = 1 and distinctvalues{0} = ""),

        CheckTypes = {"number", "date", "datetime", "text"},

        // Environment record for Expression.Evaluate: the list itself plus
        // short aliases for the conversion and list functions taken from #shared
        Env = Record.Combine({[L = values],
                            [NF = #shared[Number.From]],    [DF = #shared[Date.From]],
                            [DTF = #shared[DateTime.From]], [TF = #shared[Text.From]],
                            [LT = #shared[List.Transform]], [LS = #shared[List.Select]], [LC = #shared[List.Count]]}),

        // For each conversion function, count how many values fail to convert
        NumberOfErrors = List.Transform( {"NF", "DF", "DTF", "TF"},
                            each Expression.Evaluate("LC(LS(LT(L, each try " & _ & "(_) otherwise ""Error""), each _ = ""Error""))", Env)),

        // Type conversion of null will return null, and not throw an error, so null values do not contribute to the conversion error count.
        // If all list values are equal to null, no errors will be detected for any type; type text will be selected in this case.
        // If list values are simple values (not complex typed values like e.g. {"a","b"}), Text.From() will not throw an error,
        // and the number of errors for conversion to text will be 0; i.e. text conversion will meet the zero-error criterion.
        // If the values contain dates only (none of them has a time part),
        // the number of errors will be 0 for both date and datetime conversion, and date will be selected as it has the lower index.
        // If at least one of the date values includes a valid time part, date conversion will throw an error and datetime conversion will not;
        // hence datetime conversion will be selected.

        CheckErrorCriterion = List.Transform(NumberOfErrors, each _ <= maxerrors),

        typeindex = List.PositionOf(CheckErrorCriterion, true),  // First type which meets the error criterion;
                                                                 // if none of the types meets the error criterion, typeindex = -1

        typestring = if NullValuesOrEmptyStringsOnly or typeindex = -1 then "text" else CheckTypes{typeindex}
    in typestring
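As a quick check of the intended behaviour, the function can be called directly on a list. These examples are my own, not from the adapted code:

    // A mostly-numeric list with one stray text value.
    // With marginforerror = 0.25, one error in four values is tolerated,
    // so the majority type "number" should be returned.
    DetectListValuesType({1, 2, 3, "N/A"}, 0.25)

    // With marginforerror = 0, the stray text value disqualifies number,
    // date and datetime, and the function falls back to "text".
    DetectListValuesType({1, 2, 3, "N/A"}, 0)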

To create a schema table with the column names and data types of all the columns in a source table, I can use the Table.Profile function with its undocumented optional parameter, additionalAggregates as nullable list. The details are given in Lars Schreiber’s post on additionalAggregates. In brief, it is a list of lists, like this: {{}, {}}. Each inner list represents a new calculated profile column and consists of three items: the output column name as text, a typecheck function, and an aggregation function.
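For instance, a minimal example of my own showing that three-item shape: adding a single extra profile column that computes the median of every column.

    // {output column name, typecheck function, aggregation function}
    // "each true" means the aggregation is applied to every column.
    Table.Profile(Source, {{"Median", each true, List.Median}})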

My ProfileAggregates() function code then is as follows:

// Return a profiling table with the inferred data type of each column.
// Output columns: Column, Min, Max, Average, StandardDeviation, Count, NullCount, DistinctCount, ColumnType
(SourceTable as table, numberofrecords as number, marginforerror as number) =>
    let
        // A let variable cannot refer to itself, so use a new name rather than reassigning the parameter
        samplesize = if numberofrecords <= 0 then Table.RowCount(SourceTable) else numberofrecords,

        tblProfile = Table.Profile( SourceTable, {{"ColumnType", each true,
            each DetectListValuesType( List.FirstN( _, samplesize ), marginforerror) }})
    in
        tblProfile

This function can then be called like this:

ProfileAggregates(online_retail, 10000, .01)
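The profile can then be used to actually type the columns. The following sketch is my own code, not part of the original functions; the record-based lookup and the variable names are assumptions, but it uses only standard library functions to map each ColumnType string to an M type and feed the result to Table.TransformColumnTypes:

    let
        profile  = ProfileAggregates(online_retail, 10000, 0.01),
        typemap  = [number = type number, date = type date, datetime = type datetime, text = type text],
        // Build {{"ColumnName", type}, ...} pairs from the profile rows
        typelist = Table.ToList(
                       Table.SelectColumns(profile, {"Column", "ColumnType"}),
                       (row) => {row{0}, Record.Field(typemap, row{1})}),
        typed    = Table.TransformColumnTypes(online_retail, typelist)
    in
        typed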

I tried it on the “Online Retail II” data set, which has 1,067,371 rows. That’s too many rows for an Excel worksheet, but it makes a good test. Excel takes 25 minutes to run the complete analysis, which suggests some performance tuning is worth investigating. Here is some of the output; I reordered the columns to put ColumnType after the column name.

Output of Table.Profile with additional ColumnType column

If you know how to create the table of column names, types, and other data, directly rather than using Table.Profile, let me know at the PowerBI Community forum, or in the comments here.


About Patrick O'Beirne, Excel/VBA developer

Patrick provides consultancy and training in spreadsheet automation, data analysis, testing and model review; and the Excel spreadsheet auditing addin XLtest
