T - The specific parser settings configuration class, which can potentially provide additional configuration options supported by the parser implementation.
public abstract class AbstractParser<T extends CommonParserSettings<?>> extends Object
It handles all settings defined by CommonParserSettings, and delegates the parsing algorithm implementation to its subclasses through the abstract method parseRecord().
The following (absolutely required) attributes are exposed to subclasses:
- input (CharInputReader): the character input provider that reads characters from a given input into an internal buffer
- output (ParserOutput): the output handler for every record parsed from the input. Implementors must use this object to handle the input (such as appending characters and notifying of values parsed)
See Also: CsvParser, CsvParserSettings, FixedWidthParser, FixedWidthParserSettings, CharInputReader, ParserOutput
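To make the role of the type parameter T concrete, the following is a minimal, self-contained sketch of the same bounded-generic pattern. BaseSettings, DelimitedSettings, BaseParser and DelimitedParser are hypothetical stand-ins invented for this illustration; they are not classes of the library, and the real CommonParserSettings is itself generic, which this sketch leaves out.

```java
// Hypothetical stand-ins for illustration only; not the library's classes.
class BaseSettings {
    private int maxColumns = 512;
    int getMaxColumns() { return maxColumns; }
    void setMaxColumns(int maxColumns) { this.maxColumns = maxColumns; }
}

// A format-specific settings class that adds its own option.
class DelimitedSettings extends BaseSettings {
    private char delimiter = ',';
    char getDelimiter() { return delimiter; }
    void setDelimiter(char delimiter) { this.delimiter = delimiter; }
}

// The base parser only relies on what BaseSettings guarantees.
abstract class BaseParser<T extends BaseSettings> {
    protected final T settings;

    protected BaseParser(T settings) {
        if (settings == null) {
            throw new IllegalArgumentException("settings must not be null");
        }
        this.settings = settings;
    }

    // Common behaviour implemented once, using only the shared settings.
    int columnLimit() { return settings.getMaxColumns(); }
}

// The subclass binds T to its own settings type, so the extra
// options are reachable without a cast.
class DelimitedParser extends BaseParser<DelimitedSettings> {
    DelimitedParser(DelimitedSettings settings) { super(settings); }

    char delimiter() { return settings.getDelimiter(); }
}
```

Because T is bound at the subclass, the shared code never needs to know which concrete settings class is in use, while the subclass still sees its full configuration type.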
Modifier and Type | Field and Description |
---|---|
protected char | ch |
protected Map<Long,String> | comments |
protected ParsingContext | context |
protected CharInputReader | input |
protected String | lastComment |
protected ParserOutput | output |
protected Processor | processor |
protected RecordFactory | recordFactory |
protected T | settings |
protected int | whitespaceRangeStart |
Constructor and Description |
---|
AbstractParser(T settings) All parsers must support, at the very least, the settings provided by CommonParserSettings. |
Modifier and Type | Method and Description |
---|---|
void | beginParsing(File file) Starts an iterator-style parsing cycle. |
void | beginParsing(File file, Charset encoding) Starts an iterator-style parsing cycle. |
void | beginParsing(File file, String encoding) Starts an iterator-style parsing cycle. |
void | beginParsing(InputStream input) Starts an iterator-style parsing cycle. |
void | beginParsing(InputStream input, Charset encoding) Starts an iterator-style parsing cycle. |
void | beginParsing(InputStream input, String encoding) Starts an iterator-style parsing cycle. |
void | beginParsing(Reader reader) Starts an iterator-style parsing cycle. |
protected boolean | consumeValueOnEOF() Allows the parser implementation to handle any value that was being consumed when the end of the input was reached. |
protected ParsingContext | createParsingContext() |
ParsingContext | getContext() Returns the current parsing context with information about the status of the parser at any given time. |
protected InputAnalysisProcess | getInputAnalysisProcess() Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats. |
RecordMetaData | getRecordMetadata() Returns the metadata associated with Records parsed from the input using parseAllRecords(File) or parseNextRecord(). |
protected boolean | inComment() |
protected void | initialize() |
IterableResult<String[],ParsingContext> | iterate(File input) Provides an IterableResult for iterating rows parsed from the input. |
IterableResult<String[],ParsingContext> | iterate(File input, Charset encoding) Provides an IterableResult for iterating rows parsed from the input. |
IterableResult<String[],ParsingContext> | iterate(File input, String encoding) Provides an IterableResult for iterating rows parsed from the input. |
IterableResult<String[],ParsingContext> | iterate(InputStream input) Provides an IterableResult for iterating rows parsed from the input. |
IterableResult<String[],ParsingContext> | iterate(InputStream input, Charset encoding) Provides an IterableResult for iterating rows parsed from the input. |
IterableResult<String[],ParsingContext> | iterate(InputStream input, String encoding) Provides an IterableResult for iterating rows parsed from the input. |
IterableResult<String[],ParsingContext> | iterate(Reader input) Provides an IterableResult for iterating rows parsed from the input. |
IterableResult<Record,ParsingContext> | iterateRecords(File input) Provides an IterableResult for iterating records parsed from the input. |
IterableResult<Record,ParsingContext> | iterateRecords(File input, Charset encoding) Provides an IterableResult for iterating records parsed from the input. |
IterableResult<Record,ParsingContext> | iterateRecords(File input, String encoding) Provides an IterableResult for iterating records parsed from the input. |
IterableResult<Record,ParsingContext> | iterateRecords(InputStream input) Provides an IterableResult for iterating records parsed from the input. |
IterableResult<Record,ParsingContext> | iterateRecords(InputStream input, Charset encoding) Provides an IterableResult for iterating records parsed from the input. |
IterableResult<Record,ParsingContext> | iterateRecords(InputStream input, String encoding) Provides an IterableResult for iterating records parsed from the input. |
IterableResult<Record,ParsingContext> | iterateRecords(Reader input) Provides an IterableResult for iterating records parsed from the input. |
void | parse(File file) Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor(). |
void | parse(File file, Charset encoding) Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor(). |
void | parse(File file, String encoding) Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor(). |
void | parse(InputStream input) Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor(). |
void | parse(InputStream input, Charset encoding) Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor(). |
void | parse(InputStream input, String encoding) Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor(). |
void | parse(Reader reader) Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor(). |
List<String[]> | parseAll(File file) Parses all records from a file and returns them in a list. |
List<String[]> | parseAll(File file, Charset encoding) Parses all records from a file and returns them in a list. |
List<String[]> | parseAll(File file, String encoding) Parses all records from a file and returns them in a list. |
List<String[]> | parseAll(InputStream input) Parses all records from an input stream and returns them in a list. |
List<String[]> | parseAll(InputStream input, Charset encoding) Parses all records from an input stream and returns them in a list. |
List<String[]> | parseAll(InputStream input, String encoding) Parses all records from an input stream and returns them in a list. |
List<String[]> | parseAll(Reader reader) Parses all records from the input and returns them in a list. |
List<Record> | parseAllRecords(File file) Parses all records from a file and returns them in a list. |
List<Record> | parseAllRecords(File file, Charset encoding) Parses all records from a file and returns them in a list. |
List<Record> | parseAllRecords(File file, String encoding) Parses all records from a file and returns them in a list. |
List<Record> | parseAllRecords(InputStream input) Parses all records from an input stream and returns them in a list. |
List<Record> | parseAllRecords(InputStream input, Charset encoding) Parses all records from an input stream and returns them in a list. |
List<Record> | parseAllRecords(InputStream input, String encoding) Parses all records from an input stream and returns them in a list. |
List<Record> | parseAllRecords(Reader reader) Parses all records from the input and returns them in a list. |
String[] | parseLine(String line) Parses a single line from a String in the format supported by the parser implementation. |
String[] | parseNext() Parses the next record from the input. |
Record | parseNextRecord() Parses the next record from the input. |
protected abstract void | parseRecord() Parser-specific implementation for reading a single record from the input. |
Record | parseRecord(String line) Parses a single line from a String in the format supported by the parser implementation. |
protected void | processComment() |
protected void | reloadHeaders() Reloads headers from settings. |
void | stopParsing() Stops parsing and closes all open resources. |
protected final T extends CommonParserSettings<?> settings
protected final ParserOutput output
protected ParsingContext context
protected Processor processor
protected CharInputReader input
protected char ch
protected RecordFactory recordFactory
protected String lastComment
protected final int whitespaceRangeStart
public AbstractParser(T settings)
All parsers must support, at the very least, the settings provided by CommonParserSettings. The AbstractParser requires its configuration to be properly initialized.
settings - the parser configuration
protected void processComment()
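public final void parse(Reader reader)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
reader - The input to be parsed.
protected abstract void parseRecord()
Parser-specific implementation for reading a single record from the input.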
The AbstractParser handles the initialization and processing of the input until it is ready to be parsed.
It then delegates the input to the parser-specific implementation defined by parseRecord(). In general, an implementation of parseRecord() will perform the following steps:
- [...] (CharAppender) so the next call to output.appender.append(ch) will store the character of the next parsed value
Once parseRecord() returns, the AbstractParser takes over and handles the information (generally, reorganizing it and passing it on to a RowProcessor).
After the record processing, the AbstractParser reads the next characters from the input, delegating control again to the parseRecord() implementation for processing of the next record.
This cycle repeats until the reading process is stopped by the user, the input is exhausted, or an error happens.
In case of errors, the unchecked exception TextParsingException will be thrown and all resources in use will be closed automatically. The exception should contain the cause and more information about where in the input the error happened.
See Also: CharInputReader, CharAppender, ParserOutput, TextParsingException, RowProcessor
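The division of labour described for parseRecord() (a driver that reads ahead and collects results, and a subclass that consumes one record at a time) can be sketched with plain JDK classes. Everything below is a hypothetical stand-in written for this illustration: SketchParser, CommaParser, nextChar() and valueParsed() are assumed names, the StringBuilder only plays the role of output.appender, and UncheckedIOException stands in for the library's unchecked TextParsingException. It is not the library's implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver: owns the input, the current character and the output.
abstract class SketchParser {
    protected char ch;                                            // current character
    protected final StringBuilder appender = new StringBuilder(); // stand-in for output.appender
    private final List<String> values = new ArrayList<>();
    private Reader input;

    /** Reads the next character into ch; returns false once the input is exhausted. */
    protected boolean nextChar() {
        try {
            int c = input.read();
            if (c == -1) {
                return false;
            }
            ch = (char) c;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Marks the value held by the appender as complete and clears the appender. */
    protected void valueParsed() {
        values.add(appender.toString());
        appender.setLength(0);
    }

    /** Parser-specific: consume one record, starting at the character already in ch. */
    protected abstract void parseRecord();

    /** Driver loop: read ahead, delegate one record, collect it, repeat. */
    public final List<String[]> parseAll(Reader reader) {
        List<String[]> rows = new ArrayList<>();
        input = reader;
        try {
            while (nextChar()) {
                parseRecord();
                rows.add(values.toArray(new String[0]));
                values.clear();
            }
        } finally {
            try {
                input.close(); // resources are released on success and on error
            } catch (IOException ignored) {
                // nothing useful can be done if closing fails in this sketch
            }
        }
        return rows;
    }
}

// Hypothetical subclass: only knows how to read one comma-separated record.
class CommaParser extends SketchParser {
    @Override
    protected void parseRecord() {
        boolean more = true;
        while (more && ch != '\n') {
            if (ch == ',') {
                valueParsed();       // value finished, appender is cleared
            } else {
                appender.append(ch); // accumulate the current value
            }
            more = nextChar();
        }
        valueParsed();               // last value of the record
    }
}
```

The subclass never opens, closes or loops over the input as a whole; it only reacts to the character it is handed and asks for more, which is the property the description above relies on.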
protected boolean consumeValueOnEOF()
public final void beginParsing(Reader reader)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing.
The parsed records must be read one by one with the invocation of parseNext().
The user may invoke stopParsing() to stop reading from the input.
reader - The input to be parsed.
protected ParsingContext createParsingContext()
protected void initialize()
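The iterator-style cycle described for beginParsing(Reader) follows a begin, pull, stop shape: start once, pull rows until null is returned, and optionally stop early. The sketch below mimics that contract with plain JDK classes. IteratorStyleSketch, isOpen() and readUpTo() are hypothetical names made up for this example, the comma splitting is a deliberate simplification, and returning null when parsing was never started is a choice of this sketch rather than documented behaviour.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mimics the documented contract; not the library's parser.
class IteratorStyleSketch {
    private BufferedReader input;

    /** Starts the cycle; rows are then pulled one by one with parseNext(). */
    public void beginParsing(Reader reader) {
        input = new BufferedReader(reader);
    }

    /** Returns the next row, or null at the end of the input (the reader is closed then). */
    public String[] parseNext() {
        if (input == null) {
            return null; // never started, already stopped, or input exhausted
        }
        try {
            String line = input.readLine();
            if (line == null) {
                stopParsing(); // end of input: release resources automatically
                return null;
            }
            return line.split(",", -1);
        } catch (IOException e) {
            stopParsing();     // error: release resources before reporting it
            throw new UncheckedIOException(e);
        }
    }

    /** Stops parsing early and releases the reader. */
    public void stopParsing() {
        if (input != null) {
            try {
                input.close();
            } catch (IOException ignored) {
                // closing failed; nothing more to release in this sketch
            }
            input = null;
        }
    }

    public boolean isOpen() {
        return input != null;
    }

    /** Typical caller loop: pull rows until null, or stop once 'limit' rows were read. */
    static List<String[]> readUpTo(Reader reader, int limit) {
        IteratorStyleSketch parser = new IteratorStyleSketch();
        List<String[]> rows = new ArrayList<>();
        parser.beginParsing(reader);
        String[] row;
        while ((row = parser.parseNext()) != null) {
            rows.add(row);
            if (rows.size() == limit) {
                parser.stopParsing(); // caller-initiated stop
                break;
            }
        }
        return rows;
    }
}
```

Compared with the parseAll variants, this style keeps only one row in memory at a time and lets the caller decide when to stop.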
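protected InputAnalysisProcess getInputAnalysisProcess()
Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.
Returns an InputAnalysisProcess. By default, null is returned and no special input analysis will be performed.
public final void stopParsing()
Stops parsing and closes all open resources.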
public final List<String[]> parseAll(Reader reader)
Parses all records from the input and returns them in a list.
reader - the input to be parsed
protected boolean inComment()
public final String[] parseNext()
Parses the next record from the input. beginParsing(Reader) must have been invoked once before calling this method.
If the end of the input is reached, then this method will return null. Additionally, all resources will be closed automatically at the end of the input or if any error happens while parsing.
protected final void reloadHeaders()
Reloads headers from settings.
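public final Record parseRecord(String line)
Parses a single line from a String in the format supported by the parser implementation.
line - a line of text to be parsed
Returns a Record containing the values parsed from the input line.
public final String[] parseLine(String line)
Parses a single line from a String in the format supported by the parser implementation.
line - a line of text to be parsed
public final void parse(File file)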
Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
file - The file to be parsed.
public final void parse(File file, String encoding)
Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
file - The file to be parsed.
encoding - the encoding of the file
public final void parse(File file, Charset encoding)
Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
file - The file to be parsed.
encoding - the encoding of the file
public final void parse(InputStream input)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
input - The input to be parsed. The input stream will be closed automatically.
public final void parse(InputStream input, String encoding)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
input - The input to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream
public final void parse(InputStream input, Charset encoding)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
input - The input to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream
public final void beginParsing(File file)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing.
The parsed records must be read one by one with the invocation of parseNext().
The user may invoke stopParsing() to stop reading from the input.
file - The file to be parsed.
public final void beginParsing(File file, String encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing.
The parsed records must be read one by one with the invocation of parseNext().
The user may invoke stopParsing() to stop reading from the input.
file - The file to be parsed.
encoding - the encoding of the file
public final void beginParsing(File file, Charset encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing.
The parsed records must be read one by one with the invocation of parseNext().
The user may invoke stopParsing() to stop reading from the input.
file - The file to be parsed.
encoding - the encoding of the file
public final void beginParsing(InputStream input)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing.
The parsed records must be read one by one with the invocation of parseNext().
The user may invoke stopParsing() to stop reading from the input.
input - The input to be parsed. The input stream will be closed automatically in case of errors.
public final void beginParsing(InputStream input, String encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing.
The parsed records must be read one by one with the invocation of parseNext().
The user may invoke stopParsing() to stop reading from the input.
input - The input to be parsed. The input stream will be closed automatically in case of errors.
encoding - the encoding of the input stream
public final void beginParsing(InputStream input, Charset encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing.
The parsed records must be read one by one with the invocation of parseNext().
The user may invoke stopParsing() to stop reading from the input.
input - The input to be parsed. The input stream will be closed automatically in case of errors.
encoding - the encoding of the input stream
public final List<String[]> parseAll(File file)
Parses all records from a file and returns them in a list.
file - the input file to be parsed
public final List<String[]> parseAll(File file, String encoding)
Parses all records from a file and returns them in a list.
file - the input file to be parsed
encoding - the encoding of the file
public final List<String[]> parseAll(File file, Charset encoding)
Parses all records from a file and returns them in a list.
file - the input file to be parsed
encoding - the encoding of the file
public final List<String[]> parseAll(InputStream input)
Parses all records from an input stream and returns them in a list.
input - the input stream to be parsed. The input stream will be closed automatically
public final List<String[]> parseAll(InputStream input, String encoding)
Parses all records from an input stream and returns them in a list.
input - the input stream to be parsed. The input stream will be closed automatically
encoding - the encoding of the input stream
public final List<String[]> parseAll(InputStream input, Charset encoding)
Parses all records from an input stream and returns them in a list.
input - the input stream to be parsed. The input stream will be closed automatically
encoding - the encoding of the input stream
public final List<Record> parseAllRecords(File file)
Parses all records from a file and returns them in a list.
file - the input file to be parsed
public final List<Record> parseAllRecords(File file, String encoding)
Parses all records from a file and returns them in a list.
file - the input file to be parsed
encoding - the encoding of the file
public final List<Record> parseAllRecords(File file, Charset encoding)
Parses all records from a file and returns them in a list.
file - the input file to be parsed
encoding - the encoding of the file
public final List<Record> parseAllRecords(InputStream input)
Parses all records from an input stream and returns them in a list.
input - the input stream to be parsed. The input stream will be closed automatically
public final List<Record> parseAllRecords(InputStream input, String encoding)
Parses all records from an input stream and returns them in a list.
input - the input stream to be parsed. The input stream will be closed automatically
encoding - the encoding of the input stream
public final List<Record> parseAllRecords(InputStream input, Charset encoding)
Parses all records from an input stream and returns them in a list.
input - the input stream to be parsed. The input stream will be closed automatically
encoding - the encoding of the input stream
public final List<Record> parseAllRecords(Reader reader)
Parses all records from the input and returns them in a list.
reader - the input to be parsed
public final Record parseNextRecord()
Parses the next record from the input. beginParsing(Reader) must have been invoked once before calling this method.
If the end of the input is reached, then this method will return null. Additionally, all resources will be closed automatically at the end of the input or if any error happens while parsing.
public final ParsingContext getContext()
Returns the current parsing context with information about the status of the parser at any given time.
public final RecordMetaData getRecordMetadata()
Returns the metadata associated with Records parsed from the input using parseAllRecords(File) or parseNextRecord().
Returns the metadata of Records generated with the current input.
public final IterableResult<String[],ParsingContext> iterate(File input, String encoding)
Provides an IterableResult for iterating rows parsed from the input.
input - the input File
encoding - the encoding of the input File
public final IterableResult<String[],ParsingContext> iterate(File input, Charset encoding)
Provides an IterableResult for iterating rows parsed from the input.
input - the input File
encoding - the encoding of the input File
public final IterableResult<String[],ParsingContext> iterate(File input)
Provides an IterableResult for iterating rows parsed from the input.
input - the input File
public final IterableResult<String[],ParsingContext> iterate(Reader input)
Provides an IterableResult for iterating rows parsed from the input.
input - the input Reader
Returns an iterable over the results of parsing the Reader.
public final IterableResult<String[],ParsingContext> iterate(InputStream input, String encoding)
Provides an IterableResult for iterating rows parsed from the input.
input - the InputStream with contents to be parsed
encoding - the character encoding to be used for processing the given input.
public final IterableResult<String[],ParsingContext> iterate(InputStream input, Charset encoding)
Provides an IterableResult for iterating rows parsed from the input.
input - the InputStream with contents to be parsed
encoding - the character encoding to be used for processing the given input.
public final IterableResult<String[],ParsingContext> iterate(InputStream input)
Provides an IterableResult for iterating rows parsed from the input.
input - the InputStream with contents to be parsed
public final IterableResult<Record,ParsingContext> iterateRecords(File input, String encoding)
Provides an IterableResult for iterating records parsed from the input.
input - the input File
encoding - the encoding of the input File
public final IterableResult<Record,ParsingContext> iterateRecords(File input, Charset encoding)
Provides an IterableResult for iterating records parsed from the input.
input - the input File
encoding - the encoding of the input File
public final IterableResult<Record,ParsingContext> iterateRecords(File input)
Provides an IterableResult for iterating records parsed from the input.
input - the input File
public final IterableResult<Record,ParsingContext> iterateRecords(Reader input)
Provides an IterableResult for iterating records parsed from the input.
input - the input Reader
public final IterableResult<Record,ParsingContext> iterateRecords(InputStream input, String encoding)
Provides an IterableResult for iterating records parsed from the input.
input - the InputStream with contents to be parsed
encoding - the character encoding to be used for processing the given input.
public final IterableResult<Record,ParsingContext> iterateRecords(InputStream input, Charset encoding)
Provides an IterableResult for iterating records parsed from the input.
input - the InputStream with contents to be parsed
encoding - the character encoding to be used for processing the given input.
public final IterableResult<Record,ParsingContext> iterateRecords(InputStream input)
Provides an IterableResult for iterating records parsed from the input.
input - the InputStream with contents to be parsed
Copyright © 2018 uniVocity Software Pty Ltd. All rights reserved.