Open Monograph Press  1.1
Filter Class Reference
Inheritance diagram for Filter (image not reproduced). Classes shown in the diagram: DataObject, CitationListTokenizerFilter, CrosswalkFilter, DateStringNormalizerFilter, Nlm30CitationDemultiplexerFilter, Nlm30PersonStringFilter, PersistableFilter, Nlm30Openurl10CrosswalkFilter, Nlm30NameSchemaPersonStringFilter, PersonStringNlm30NameSchemaFilter, CompositeFilter, Mods34DescriptionXmlFilter, NativeImportExportFilter, Nlm30CitationSchemaFilter, PersistableTestFilter, TemplateBasedFilter, XSLTransformationFilter.

Public Member Functions

 addError ($message)
 clearErrors ()
& execute (&$input)
 Filter ($inputType, $outputType)
 getDisplayName ()
 getErrors ()
& getInputType ()
& getLastInput ()
& getLastOutput ()
& getOutputType ()
& getRuntimeEnvironment ()
 getSeq ()
 hasErrors ()
 isCompatibleWithRuntimeEnvironment ()
& process (&$input)
 setDisplayName ($displayName)
 setRuntimeEnvironment (&$runtimeEnvironment)
 setSeq ($seq)
 setTransformationType (&$inputType, &$outputType)
 supports (&$input, &$output)
 supportsAsInput (&$input)
- Public Member Functions inherited from DataObject
 addSupportedMetadataAdapter ($metadataAdapter)
 DataObject ()
 extractMetadata ($metadataSchema)
 getAdditionalMetadataFieldNames ()
getAllData ()
getData ($key, $locale=null)
 getHasLoadableAdapters ()
 getId ()
 getLocaleMetadataFieldNames ()
getLocalizedData ($key)
 getMetadataFieldNames ($translated=true)
 getSetMetadataFieldNames ($translated=true)
 getSupportedExtractionAdapters ()
 getSupportedInjectionAdapters ()
 getSupportedMetadataSchemas ()
 hasData ($key, $locale=null)
 injectMetadata ($metadataDescription)
 removeSupportedMetadataAdapter ($metadataSchemaName)
 setAllData (&$data)
 setData ($key, $value, $locale=null)
 setHasLoadableAdapters ($hasLoadableAdapters)
 setId ($id)
 upcastTo ($targetObject)

Static Public Member Functions

static supportedRuntimeEnvironmentSettings ()

Public Attributes

 $_errors = array()
 $_runtimeEnvironment = false
- Public Attributes inherited from DataObject
 $_data = array()
 $_extractionAdaptersLoaded = false
 $_hasLoadableAdapters = false
 $_injectionAdaptersLoaded = false
 $_metadataExtractionAdapters = array()
 $_metadataInjectionAdapters = array()

Detailed Description

Class that provides the basic template for a filter. Filters are generic data processors that take in a well-specified data type and return another well-specified data type.

Filters enable us to re-use data transformations between applications. Generic filter implementations can sequence, (de-)multiplex or iterate over other filters. Thereby filters can be nested and combined in many different ways to form complex and easy-to-customize data processing networks or pipelines.

NB: This also means that filters only make sense if they accept and return standardized formats that are understood by other filters. Otherwise the extra implementation effort for a filter won't result in improved code re-use.
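As a rough illustration of how typed filters can be sequenced, the sketch below chains two small filters and checks that each output type matches the next input type. All class names here (SketchFilter, TrimFilter, WordListFilter, SequenceFilter) are hypothetical stand-ins invented for this example; they are not PKP classes and do not reproduce the library's implementation.

```php
<?php
// Illustrative stand-ins only; none of these classes exist in PKP.

abstract class SketchFilter {
    /** @var string name of the accepted input type */
    public $inputType;
    /** @var string name of the produced output type */
    public $outputType;

    public function __construct(string $inputType, string $outputType) {
        $this->inputType = $inputType;
        $this->outputType = $outputType;
    }

    /** Transform one value of $inputType into one value of $outputType. */
    abstract public function process($input);
}

class TrimFilter extends SketchFilter {
    public function __construct() { parent::__construct('string', 'string'); }
    public function process($input) { return trim($input); }
}

class WordListFilter extends SketchFilter {
    public function __construct() { parent::__construct('string', 'array'); }
    public function process($input) {
        // Split on runs of whitespace and drop empty pieces.
        return preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
    }
}

/** Runs filters in order; each output type must match the next input type. */
class SequenceFilter extends SketchFilter {
    /** @var SketchFilter[] */
    private $filters;

    public function __construct(array $filters) {
        $filters = array_values($filters);
        if (count($filters) === 0) {
            throw new InvalidArgumentException('Empty sequence');
        }
        for ($i = 1; $i < count($filters); $i++) {
            if ($filters[$i - 1]->outputType !== $filters[$i]->inputType) {
                throw new InvalidArgumentException('Incompatible filters at position ' . $i);
            }
        }
        // The sequence itself is a filter from the first input type to the last output type.
        parent::__construct($filters[0]->inputType, $filters[count($filters) - 1]->outputType);
        $this->filters = $filters;
    }

    public function process($input) {
        foreach ($this->filters as $filter) {
            $input = $filter->process($input);
        }
        return $input;
    }
}

$pipeline = new SequenceFilter([new TrimFilter(), new WordListFilter()]);
$words = $pipeline->process("  open   monograph press ");
```

Because the sequence is itself a filter with an input and an output type, it can be nested inside another sequence, which is the property the description above relies on.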

Objects from different applications (e.g. Papers and Articles) can first be transformed by an application-specific filter into a common format and then be processed by application-agnostic import/export filters, or vice versa. Filters can be used to pre-process data before it is indexed for search. They also provide a framework to customize the processing applied in citation parsing and lookup (i.e. which parsers and lookup sources should be applied).

Filters can be used stand-alone outside PKP applications.

The following is a complete list of all use-cases that have been identified for filters:

1) Decode/Encode

  • import/export: transform application objects (e.g. an Article object) into structured (rich) data formats (e.g. XML, OpenURL KEV, CSV) or vice versa.
  • parse: transform unstructured clob/blob data (e.g. a Word Document) into application objects (e.g. an Article plus Citation objects) or into structured data formats (e.g. XML).
  • render: transform application objects or structured clob/blob data into an unstructured document (e.g. PDF, HTML, Word Document).

2) Normalize

  • lookup: compare the data of a given entity (e.g. a bibliographic reference) with data from other sources (e.g. CrossRef) and use this to normalize data or improve data quality.
  • harvest: cleanse and normalize incoming meta-data.

3) Map

  • cross-walk: transform one meta-data format into another. Meta-data can be represented as structured clob/blob data (e.g. XML) or as application objects (i.e. a MetadataRecord instance).
  • meta-data extraction: retrieve meta-data from OO entities (e.g. an Article) into a standardized meta-data record (e.g. NLM element-citation).
  • meta-data injection: inject data from a standardized meta-data record into application objects.

4) Convert documents

  • binary converters: wrap binary document converters (e.g. antidoc) in a well-defined and re-usable way.

5) Search

  • indexing: pre-process data (extract, tokenize, remove stopwords, stem) for indexing.
  • finding: pre-process queries (parse, tokenize, remove stopwords, stem) to access the index.
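The search use-case above can be sketched as a small pre-processing chain. This is only an illustration under stated assumptions: the function names and the stopword list are invented for the example, the tokenizer handles ASCII letters and digits only, and stemming is left out.

```php
<?php
// Illustrative sketch; function names and the stopword list are assumptions.

/** Lower-case the text and split it into ASCII word tokens. */
function tokenizeForIndex(string $text): array {
    preg_match_all('/[a-z0-9]+/', strtolower($text), $matches);
    return $matches[0];
}

/** Drop tokens that appear in the stopword list and re-index the result. */
function removeStopwords(array $tokens, array $stopwords): array {
    return array_values(array_diff($tokens, $stopwords));
}

/** Tokenize, then remove stopwords (no stemming in this sketch). */
function preprocessForIndex(string $text): array {
    $stopwords = ['a', 'an', 'and', 'of', 'the'];
    return removeStopwords(tokenizeForIndex($text), $stopwords);
}

$terms = preprocessForIndex('The Theory of Filters and Pipelines');
```

Running the same function over a user query before looking it up keeps index terms and query terms comparable, which is why indexing and finding are listed together.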

Definition at line 78 of file

Member Function Documentation

Filter::addError ( $message )

Add a filter error


Definition at line 265 of file

Referenced by GenericSequencerFilter\process(), and ParscitRawCitationNlm30CitationSchemaFilter\process().

Filter::clearErrors ( )

Clear all processing errors.

Definition at line 288 of file
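A minimal sketch of how the error methods might fit together, inferred only from the one-line descriptions on this page. The class ErrorCollectingSketch is a hypothetical stand-in, not the PKP Filter class.

```php
<?php
// Illustrative stand-in; not the PKP Filter class.
class ErrorCollectingSketch {
    /** @var string[] messages collected during processing */
    private $errors = [];

    public function addError(string $message): void { $this->errors[] = $message; }
    public function getErrors(): array { return $this->errors; }
    public function hasErrors(): bool { return count($this->errors) > 0; }
    public function clearErrors(): void { $this->errors = []; }
}

$sketch = new ErrorCollectingSketch();
$sketch->addError('Could not parse citation.');
$hadErrors = $sketch->hasErrors();   // true while a message is stored
$messages = $sketch->getErrors();
$sketch->clearErrors();              // reset before the next run
```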

& Filter::execute ( &$input )

Filters the given input.

Input and output of this method will be tested for compliance with the filter definition.

NB: sub-classes will not normally override this method.

Parameters
$input mixed an input value that is supported by this filter

Returns
mixed a valid return value or null if an error occurred during processing

Definition at line 442 of file
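The relationship between execute() and process() described here can be sketched as follows. This is a simplified illustration, not the PKP implementation: the classes are hypothetical, and types are compared using the names returned by PHP's gettype() rather than TypeDescription objects.

```php
<?php
// Illustrative sketch only: a simplified stand-in, not the PKP implementation.

abstract class ValidatingSketchFilter {
    /** @var string type name as reported by gettype(), e.g. 'string' */
    private $inputType;
    /** @var string type name as reported by gettype(), e.g. 'integer' */
    private $outputType;

    public function __construct(string $inputType, string $outputType) {
        $this->inputType = $inputType;
        $this->outputType = $outputType;
    }

    /** Subclasses implement the actual transformation here. */
    abstract protected function process($input);

    /** Validate the input, run process(), then validate the output. */
    public function execute($input) {
        if (gettype($input) !== $this->inputType) {
            return null; // unsupported input
        }
        $output = $this->process($input);
        if ($output === null || gettype($output) !== $this->outputType) {
            return null; // processing failed or produced the wrong type
        }
        return $output;
    }
}

class StringLengthSketchFilter extends ValidatingSketchFilter {
    public function __construct() { parent::__construct('string', 'integer'); }
    protected function process($input) { return strlen($input); }
}

$filter = new StringLengthSketchFilter();
$ok = $filter->execute('monograph'); // input and output both pass validation
$rejected = $filter->execute(42);    // wrong input type, so execute() returns null
```

Keeping validation in execute() means a subclass only has to supply process(), which matches the note that sub-classes will not normally override execute().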

Filter::Filter ( $inputType, $outputType )

Receives input and output type that define the transformation.

See Also

Parameters
$inputType string a string representation of a TypeDescription
$outputType string a string representation of a TypeDescription

Definition at line 127 of file

Filter::getDisplayName ( )

Get the display name

NB: The standard implementation of this method will initialize the display name with the filter class name. Subclasses can of course override this behavior by explicitly setting a display name.


Definition at line 155 of file

Referenced by ParscitRawCitationNlm30CitationSchemaFilter\process(), and ParaciteRawCitationNlm30CitationSchemaFilter\process().
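The default described above, where the display name falls back to the class name until one is set explicitly, could look roughly like this. DisplayNameSketch and CitationCleanerSketch are hypothetical names used only for illustration.

```php
<?php
// Illustrative stand-ins; not PKP classes.
class DisplayNameSketch {
    /** @var string|null explicit display name, or null if none was set */
    private $displayName = null;

    public function setDisplayName(string $displayName): void {
        $this->displayName = $displayName;
    }

    public function getDisplayName(): string {
        if ($this->displayName === null) {
            // Fall back to the runtime class name of the concrete object.
            $this->displayName = get_class($this);
        }
        return $this->displayName;
    }
}

class CitationCleanerSketch extends DisplayNameSketch {}

$cleaner = new CitationCleanerSketch();
$defaultName = $cleaner->getDisplayName();
$cleaner->setDisplayName('Citation cleaner');
$customName = $cleaner->getDisplayName();
```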

Filter::getErrors ( )

Get all filter errors


Definition at line 273 of file

& Filter::getInputType ( )

Get the input type


Definition at line 209 of file

Referenced by getRuntimeEnvironment().

& Filter::getLastInput ( )

Get the last valid input processed by this filter.

This can be used for debugging internal filter state or for access to intermediate results when working with larger filter grids.

NB: The input will be set only after input validation so that you can be sure that you'll always find valid data here.


Definition at line 257 of file

& Filter::getLastOutput ( )

Get the last valid output produced by this filter.

This can be used for debugging internal filter state or for access to intermediate results when working with larger filter grids.

NB: The output will be set only after output validation so that you can be sure that you'll always find valid data here.


Definition at line 237 of file

& Filter::getOutputType ( )

Get the output type


Definition at line 217 of file

Referenced by getRuntimeEnvironment().

& Filter::getRuntimeEnvironment ( )

Get the required runtime environment


Definition at line 313 of file

References getInputType(), and getOutputType().

Filter::getSeq ( )

Get the sequence id


Definition at line 175 of file

Referenced by ParaciteRawCitationNlm30CitationSchemaFilter\process().

Filter::hasErrors ( )

Whether this filter has produced errors.


Definition at line 281 of file

Filter::isCompatibleWithRuntimeEnvironment ( )

Check whether the filter is compatible with the required runtime environment.


Definition at line 393 of file

& Filter::process ( &$input )

This method performs the actual data processing. NB: sub-classes must implement this method.

Parameters
$input mixed validated filter input data

Returns
mixed non-validated filter output or null if processing was not successful.

Definition at line 328 of file

Filter::setDisplayName ( $displayName )

Set the display name


Definition at line 140 of file

Referenced by CitationListTokenizerFilter\CitationListTokenizerFilter(), CrossrefNlm30CitationSchemaFilter\CrossrefNlm30CitationSchemaFilter(), DateStringNormalizerFilter\DateStringNormalizerFilter(), FreeciteRawCitationNlm30CitationSchemaFilter\FreeciteRawCitationNlm30CitationSchemaFilter(), Mods34DescriptionXmlFilter\Mods34DescriptionXmlFilter(), MonographONIX30XmlFilter\MonographONIX30XmlFilter(), NativeXmlPKPAuthorFilter\NativeXmlPKPAuthorFilter(), NativeXmlRepresentationFilter\NativeXmlRepresentationFilter(), NativeXmlSubmissionFileFilter\NativeXmlSubmissionFileFilter(), NativeXmlSubmissionFilter\NativeXmlSubmissionFilter(), NativeXmlUserGroupFilter\NativeXmlUserGroupFilter(), Nlm30CitationSchemaAbntFilter\Nlm30CitationSchemaAbntFilter(), Nlm30CitationSchemaApaFilter\Nlm30CitationSchemaApaFilter(), Nlm30CitationSchemaMlaFilter\Nlm30CitationSchemaMlaFilter(), Nlm30CitationSchemaNlm30XmlFilter\Nlm30CitationSchemaNlm30XmlFilter(), Nlm30CitationSchemaOpenurl10CrosswalkFilter\Nlm30CitationSchemaOpenurl10CrosswalkFilter(), Nlm30CitationSchemaVancouverFilter\Nlm30CitationSchemaVancouverFilter(), Openurl10Nlm30CitationSchemaCrosswalkFilter\Openurl10Nlm30CitationSchemaCrosswalkFilter(), PKPAuthorNativeXmlFilter\PKPAuthorNativeXmlFilter(), PKPSubmissionNlm30XmlFilter\PKPSubmissionNlm30XmlFilter(), PKPUserUserXmlFilter\PKPUserUserXmlFilter(), PubmedNlm30CitationSchemaFilter\PubmedNlm30CitationSchemaFilter(), RepresentationNativeXmlFilter\RepresentationNativeXmlFilter(), SubmissionFileNativeXmlFilter\SubmissionFileNativeXmlFilter(), SubmissionNativeXmlFilter\SubmissionNativeXmlFilter(), UserGroupNativeXmlFilter\UserGroupNativeXmlFilter(), UserXmlPKPUserFilter\UserXmlPKPUserFilter(), WorldcatNlm30CitationSchemaFilter\WorldcatNlm30CitationSchemaFilter(), and XSLTransformationFilter\XSLTransformationFilter().

Filter::setRuntimeEnvironment ( &$runtimeEnvironment )

Set the required runtime environment


Definition at line 296 of file

Filter::setSeq ( $seq )

Set the sequence id


Definition at line 167 of file

Filter::setTransformationType ( &$inputType, &$outputType )

Set the input/output type of this filter group.

See Also
TypeDescriptionFactory::instantiateTypeDescription() for more details

Definition at line 187 of file

static Filter::supportedRuntimeEnvironmentSettings ( )

Returns a static array with supported runtime environment settings and their default values.


Definition at line 485 of file
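One way a static array of settings with default values can be used is to overlay caller-supplied values on the defaults. The sketch below is an assumption-laden illustration: the class RuntimeSettingsSketch, the resolve() helper, and the setting names minVersion and requiredExtensions are all hypothetical and are not taken from the PKP source.

```php
<?php
// Illustrative sketch; the class, helper, and setting names are hypothetical.
class RuntimeSettingsSketch {
    /** Supported settings mapped to their default values. */
    public static function supportedRuntimeEnvironmentSettings(): array {
        static $settings = [
            'minVersion' => null,
            'requiredExtensions' => [],
        ];
        return $settings;
    }

    /** Overlay caller-supplied values on the defaults, rejecting unknown keys. */
    public static function resolve(array $overrides): array {
        $defaults = self::supportedRuntimeEnvironmentSettings();
        $unknown = array_diff_key($overrides, $defaults);
        if (count($unknown) > 0) {
            throw new InvalidArgumentException(
                'Unsupported setting: ' . implode(', ', array_keys($unknown))
            );
        }
        return array_merge($defaults, $overrides);
    }
}

$resolved = RuntimeSettingsSketch::resolve(['minVersion' => '7.4']);
```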

Filter::supports ( &$input, &$output )

Returns true if the given input and output objects represent a valid transformation for this filter.

This check must be type based. It can optionally include an additional stateful inspection of the given object instances.

If the output type is null then only check whether the given input type is one of the input types accepted by this filter.

The standard implementation provides full type based checking. Subclasses must implement any required stateful inspection of the provided objects.


Definition at line 358 of file
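The type-based check, including the special case of a null output, can be sketched as below. This is an illustration only: SupportsSketch is a hypothetical class, and it compares gettype() names instead of the TypeDescription objects the real class works with.

```php
<?php
// Illustrative sketch; type names are those reported by gettype().
class SupportsSketch {
    /** @var string */
    private $inputType;
    /** @var string */
    private $outputType;

    public function __construct(string $inputType, string $outputType) {
        $this->inputType = $inputType;
        $this->outputType = $outputType;
    }

    /** True if $input and $output form a valid transformation for this filter. */
    public function supports($input, $output): bool {
        if (gettype($input) !== $this->inputType) {
            return false;
        }
        // A null output means: only check the input side.
        if ($output === null) {
            return true;
        }
        return gettype($output) === $this->outputType;
    }

    /** Input-only check, expressed through supports() with a null output. */
    public function supportsAsInput($input): bool {
        return $this->supports($input, null);
    }
}

$check = new SupportsSketch('string', 'array');
$validPair = $check->supports('a b', ['a', 'b']);
$inputOnly = $check->supportsAsInput('a b');
$wrongInput = $check->supportsAsInput(5);
$wrongOutput = $check->supports('a b', 'not an array');
```

A subclass that needs stateful inspection, for example checking that an array is non-empty, could add that test on top of the type comparison.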

Filter::supportsAsInput ( &$input )

Returns true if the given input is supported by this filter. Otherwise returns false.

NB: sub-classes will not normally override this method.


Definition at line 383 of file

References DataObject\getData(), and DataObject\hasData().

Member Data Documentation

array Filter::$_errors = array()

a list of errors that occurred while filtering

Definition at line 107 of file

RuntimeEnvironment Filter::$_runtimeEnvironment = false

the installation requirements needed to run this filter instance; false on initialization.

Definition at line 116 of file

The documentation for this class was generated from the following file: