Filter Class Reference
Inheritance diagram for Filter (image not reproduced). Classes shown: DataObject, CitationListTokenizerFilter, CrosswalkFilter, DateStringNormalizerFilter, Nlm30CitationDemultiplexerFilter, Nlm30PersonStringFilter, PersistableFilter, Nlm30Openurl10CrosswalkFilter, Nlm30NameSchemaPersonStringFilter, PersonStringNlm30NameSchemaFilter, CompositeFilter, MetadataDataObjectAdapter, Mods34DescriptionXmlFilter, Nlm30CitationSchemaFilter, PersistableTestFilter, TemplateBasedFilter, XSLTransformationFilter

Public Member Functions

 addError ($message)
 clearErrors ()
execute (&$input)
 Filter ($inputType, $outputType)
 getDisplayName ()
 getErrors ()
getInputType ()
getLastInput ()
getLastOutput ()
getOutputType ()
getRuntimeEnvironment ()
 getSeq ()
 hasErrors ()
 isCompatibleWithRuntimeEnvironment ()
process (&$input)
 setDisplayName ($displayName)
 setRuntimeEnvironment (&$runtimeEnvironment)
 setSeq ($seq)
 setTransformationType (&$inputType, &$outputType)
 supportedRuntimeEnvironmentSettings ()
 supports (&$input, &$output)
 supportsAsInput (&$input)
- Public Member Functions inherited from DataObject
 addSupportedMetadataAdapter (&$metadataAdapter)
 DataObject ($callHooks=true)
extractMetadata (&$metadataSchema)
 getAdditionalMetadataFieldNames ()
getAllData ()
getData ($key, $locale=null)
 getHasLoadableAdapters ()
 getId ()
 getLocaleMetadataFieldNames ()
getLocalizedData ($key)
 getMetadataFieldNames ($translated=true)
 getSetMetadataFieldNames ($translated=true)
getSupportedExtractionAdapters ()
getSupportedInjectionAdapters ()
getSupportedMetadataSchemas ()
 hasData ($key, $locale=null)
 injectMetadata (&$metadataDescription)
 removeSupportedMetadataAdapter ($metadataSchemaName)
 setAllData (&$data)
 setData ($key, $value, $locale=null)
 setHasLoadableAdapters ($hasLoadableAdapters)
 setId ($id)
upcastTo (&$targetObject)

Additional Inherited Members

- Public Attributes inherited from DataObject
 $_data = array()

Detailed Description

Class that provides the basic template for a filter. Filters are generic data processors that take in a well-specified data type and return another well-specified data type.

Filters enable us to re-use data transformations between applications. Generic filter implementations can sequence, (de-)multiplex or iterate over other filters. Thereby filters can be nested and combined in many different ways to form complex and easy-to-customize data processing networks or pipelines.

NB: This also means that filters only make sense if they accept and return standardized formats that are understood by other filters. Otherwise the extra implementation effort for a filter won't result in improved code re-use.
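
The nesting described above can be sketched in a few lines. This is a standalone illustration, not the library's implementation: SketchFilter and SketchSequence are hypothetical names, and plain PHP callables stand in for real filter classes. It assumes PHP 7 or later.

```php
<?php
// Hypothetical stand-in for a filter: wraps a callable that maps input to output.
// None of these class names come from the PKP library.
final class SketchFilter
{
    /** @var callable */
    private $transform;

    public function __construct(callable $transform)
    {
        $this->transform = $transform;
    }

    public function execute($input)
    {
        return ($this->transform)($input);
    }
}

// A generic filter that runs other filters in sequence. It exposes the same
// execute() method, so a sequence can be nested inside another sequence.
final class SketchSequence
{
    /** @var array */
    private $filters;

    public function __construct(array $filters)
    {
        $this->filters = $filters;
    }

    public function execute($input)
    {
        foreach ($this->filters as $filter) {
            $input = $filter->execute($input);
            if ($input === null) {
                return null; // stop at the first failing step
            }
        }
        return $input;
    }
}

$trim = new SketchFilter('trim');
$lower = new SketchFilter('strtolower');
$tokenize = new SketchFilter(function ($text) {
    return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
});

// A two-step normalizer, reused as a single step of a larger pipeline.
$normalize = new SketchSequence([$trim, $lower]);
$pipeline = new SketchSequence([$normalize, $tokenize]);

$tokens = $pipeline->execute("  Open Journal Systems ");
// $tokens is ['open', 'journal', 'systems']
```

The pipeline only works because each step returns something the next step accepts (string to string, then string to array), which is the point made in the NB above.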

Objects from different applications (e.g. Papers and Articles) can first be transformed by an application specific filter into a common format and then be processed by application agnostic import/export filters or vice versa. Filters can be used to pre-process data before it is indexed for search. They also provide a framework to customize the processing applied in citation parsing and lookup (i.e. which parsers and lookup sources should be applied).

Filters can be used stand-alone outside PKP applications.

The following is a complete list of all use-cases that have been identified for filters:

1) Decode/Encode

  • import/export: transform application objects (e.g. an Article object) into structured (rich) data formats (e.g. XML, OpenURL KEV, CSV) or vice versa.
  • parse: transform unstructured clob/blob data (e.g. a Word Document) into application objects (e.g. an Article plus Citation objects) or into structured data formats (e.g. XML).
  • render: transform application objects or structured clob/blob data into an unstructured document (e.g. PDF, HTML, Word Document).

2) Normalize

  • lookup: compare the data of a given entity (e.g. a bibliographic reference) with data from other sources (e.g. CrossRef) and use this to normalize data or improve data quality.
  • harvest: cleanse and normalize incoming meta-data.

3) Map

  • cross-walk: transform one meta-data format into another. Meta-data can be represented as structured clob/blob data (e.g. XML) or as application objects (i.e. a MetadataRecord instance).
  • meta-data extraction: retrieve meta-data from OO entities (e.g. an Article) into a standardized meta-data record (e.g. NLM element-citation).
  • meta-data injection: inject data from a standardized meta-data record into application objects.

4) Convert documents

  • binary converters: wrap binary document converters (e.g. antidoc) in a well-defined and re-usable way.

5) Search

  • indexing: pre-process data (extract, tokenize, remove stopwords, stem) for indexing.
  • finding: pre-process queries (parse, tokenize, remove stopwords, stem) to access the index.

Member Function Documentation

Filter::clearErrors ( )

Clear all processing errors.

& Filter::execute ( $input)

Filters the given input.

Input and output of this method will be tested for compliance with the filter definition.

NB: sub-classes will not normally override this method.

Parameters
$input mixed an input value that is supported by this filter

Returns
mixed a valid return value or null if an error occurred during processing
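
The validate, process, validate flow described for execute() might look roughly like the following. This is a simplified sketch under stated assumptions: SketchTypedFilter and SketchWordCountFilter are hypothetical, and plain gettype() strings stand in for the TypeDescription checks the real class uses.

```php
<?php
// Standalone sketch of the documented execute() contract. The classes and the
// gettype()-based checks are hypothetical simplifications.
abstract class SketchTypedFilter
{
    private $inputType;
    private $outputType;
    private $lastInput = null;
    private $lastOutput = null;

    public function __construct(string $inputType, string $outputType)
    {
        $this->inputType = $inputType;
        $this->outputType = $outputType;
    }

    // Sub-classes implement the actual transformation here.
    abstract protected function process($input);

    public function execute($input)
    {
        // 1) Validate the input against the declared input type.
        if (gettype($input) !== $this->inputType) {
            return null;
        }
        $this->lastInput = $input; // recorded only after validation

        // 2) Run the sub-class transformation.
        $output = $this->process($input);

        // 3) Validate the output against the declared output type.
        if ($output === null || gettype($output) !== $this->outputType) {
            return null;
        }
        $this->lastOutput = $output; // recorded only after validation
        return $output;
    }

    public function getLastInput()
    {
        return $this->lastInput;
    }

    public function getLastOutput()
    {
        return $this->lastOutput;
    }
}

final class SketchWordCountFilter extends SketchTypedFilter
{
    public function __construct()
    {
        parent::__construct('string', 'integer');
    }

    protected function process($input)
    {
        return str_word_count($input);
    }
}

$filter = new SketchWordCountFilter();
$count = $filter->execute('filters are generic data processors'); // 5
$rejected = $filter->execute(42); // null: an integer is not a supported input
```

Keeping the checks in execute() means a sub-class only has to supply process(), which is why sub-classes will not normally override execute().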

Filter::Filter ( $inputType, $outputType)

Receives input and output type that define the transformation.

Parameters
$inputType string a string representation of a TypeDescription
$outputType string a string representation of a TypeDescription

Filter::getDisplayName ( )

Get the display name

NB: The standard implementation of this method will initialize the display name with the filter class name. Subclasses can of course override this behavior by explicitly setting a display name.
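
A minimal sketch of this fallback behavior, assuming the name is initialized lazily on first read (the documentation does not say when initialization happens). SketchNamedFilter and SketchCitationFilter are hypothetical classes.

```php
<?php
// Hypothetical sketch of a display name that defaults to the class name.
class SketchNamedFilter
{
    private $displayName = null;

    public function setDisplayName($displayName)
    {
        $this->displayName = $displayName;
    }

    public function getDisplayName()
    {
        // Fall back to the concrete class name when no name was set explicitly.
        if ($this->displayName === null) {
            $this->displayName = get_class($this);
        }
        return $this->displayName;
    }
}

class SketchCitationFilter extends SketchNamedFilter
{
}

$filter = new SketchCitationFilter();
$default = $filter->getDisplayName(); // 'SketchCitationFilter'

$filter->setDisplayName('Citation cleanup');
$custom = $filter->getDisplayName(); // 'Citation cleanup'
```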


Filter::getErrors ( )

Get all filter errors


& Filter::getInputType ( )

Get the input type

& Filter::getLastInput ( )

Get the last valid input processed by this filter.

This can be used for debugging internal filter state or for access to intermediate results when working with larger filter grids.

NB: The input will be set only after input validation so that you can be sure that you'll always find valid data here.


& Filter::getLastOutput ( )

Get the last valid output produced by this filter.

This can be used for debugging internal filter state or for access to intermediate results when working with larger filter grids.

NB: The output will be set only after output validation so that you can be sure that you'll always find valid data here.


& Filter::getOutputType ( )

Get the output type


& Filter::getRuntimeEnvironment ( )

Get the required runtime environment


Filter::getSeq ( )

Get the sequence id


Filter::hasErrors ( )

Whether this filter has produced errors.


Filter::isCompatibleWithRuntimeEnvironment ( )

Check whether the filter is compatible with the required runtime environment.


Definition at line 375 of file

& Filter::process ( $input)

This method performs the actual data processing. NB: sub-classes must implement this method.

Parameters
$input mixed validated filter input data

Returns
mixed non-validated filter output or null if processing was not successful.
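
As an illustration of the process() contract, the sketch below shows a sub-class that returns null and records an error when it cannot handle its input. The classes are hypothetical and self-contained. Only the method names addError(), getErrors(), hasErrors() and clearErrors() are taken from the member list above; their bodies here are assumptions.

```php
<?php
// Hypothetical minimal base class with only the parts this sketch needs.
abstract class SketchBaseFilter
{
    private $errors = [];

    public function addError($message)
    {
        $this->errors[] = $message;
    }

    public function getErrors()
    {
        return $this->errors;
    }

    public function hasErrors()
    {
        return count($this->errors) > 0;
    }

    public function clearErrors()
    {
        $this->errors = [];
    }

    abstract public function process($input);
}

// Turns "D/M/YYYY" into "YYYY-MM-DD"; anything else is a processing failure.
final class SketchDateFilter extends SketchBaseFilter
{
    public function process($input)
    {
        $matched = preg_match('#^(\d{1,2})/(\d{1,2})/(\d{4})$#', $input, $m);
        // checkdate() takes month, day, year in that order.
        if (!$matched || !checkdate((int) $m[2], (int) $m[1], (int) $m[3])) {
            $this->addError("Cannot parse date: $input");
            return null; // signals that processing was not successful
        }
        return sprintf('%04d-%02d-%02d', $m[3], $m[2], $m[1]);
    }
}

$filter = new SketchDateFilter();
$ok = $filter->process('7/3/2011');      // '2011-03-07'
$failed = $filter->process('31/2/2011'); // null, 31 February does not exist
$errors = $filter->getErrors();
```

A caller would normally go through execute() rather than calling process() directly, so that input and output are validated around this step.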

Filter::setDisplayName (   $displayName)

Set the display name


Filter::setRuntimeEnvironment ( $runtimeEnvironment)

Set the required runtime environment


Filter::setSeq (   $seq)

Set the sequence id


Filter::setTransformationType ( $inputType, $outputType)

Set the input/output type of this filter group.

See Also
TypeDescriptionFactory::instantiateTypeDescription() for more details

Filter::supportedRuntimeEnvironmentSettings ( )

Returns a static array with supported runtime environment settings and their default values.

PHP4 workaround for missing static class members.


Filter::supports ( $input, $output)

Returns true if the given input and output objects represent a valid transformation for this filter.

This check must be type based. It can optionally include an additional stateful inspection of the given object instances.

If the output type is null then only check whether the given input type is one of the input types accepted by this filter.

The standard implementation provides full type based checking. Subclasses must implement any required stateful inspection of the provided objects.
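
The type-based check, including the null-output case, can be sketched as follows. SketchSupportsFilter is hypothetical, gettype() strings stand in for TypeDescription instances, and routing supportsAsInput() through supports() with a null output is an assumption of this sketch, not something the documentation states.

```php
<?php
// Hypothetical sketch of a purely type-based supports() check.
class SketchSupportsFilter
{
    private $inputType;
    private $outputType;

    public function __construct(string $inputType, string $outputType)
    {
        $this->inputType = $inputType;
        $this->outputType = $outputType;
    }

    public function supports($input, $output)
    {
        // The input side is always checked.
        if (gettype($input) !== $this->inputType) {
            return false;
        }
        // A null output means "check the input side only".
        if ($output === null) {
            return true;
        }
        return gettype($output) === $this->outputType;
    }

    public function supportsAsInput($input)
    {
        // Assumption: an input-only check is supports() with a null output.
        return $this->supports($input, null);
    }
}

$filter = new SketchSupportsFilter('string', 'array');
$validPair = $filter->supports('a b', ['a', 'b']); // true
$wrongOutput = $filter->supports('a b', 'a b');    // false
$validInput = $filter->supportsAsInput('a b');     // true
$wrongInput = $filter->supportsAsInput(12);        // false
```

A sub-class that needs stateful inspection (for example, checking that an array is non-empty) would add that test on top of the type check.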


Filter::supportsAsInput ( $input)

Returns true if the given input is supported by this filter. Otherwise returns false.

NB: sub-classes will not normally override this method.


Definition at line 365 of file

