In Napita, a queue is a temporary storage buffer that holds data as it moves from one processor to the next within a data flow. Queues occasionally need maintenance to keep the data flow intact and efficient. A common case is a processor that fails because of a corrupted file in its incoming queue.
For example, consider a data flow in which one processor retrieves files from an SFTP location and subsequent processors process them. If a retrieved file has an invalid format, the subsequent processor cannot execute successfully. Replacing the invalid file on the SFTP location may not be enough, because the processor will keep attempting to process the file that is already in the queue and will fail repeatedly. To resolve this, empty the queue that contains the invalid file and then load a valid file from the data source, so that the data flow resumes with the latest valid data.
Identify the Queue: Determine which queue in your data flow is holding the invalid file. This queue is typically located between the processor that retrieved the file and the subsequent processor that failed to execute.
Empty the Queue:
Right-click on the queue located before the failing processor.
Select the "Empty Queue" option to remove the corrupted file from the queue.
Re-run the Processor:
Right-click on the processor located prior to the emptied queue.
Choose the "Run Once" option to execute the processor again and list a new file from the source (e.g., SFTP location).
This action ensures that the latest file is listed in the queue for processing.
Verify Queue Contents:
To verify that the queue is now holding the correct file, right-click on the queue.
Select the "List Queues" option. This will display all files currently listed in the queue.
Click on the eye icon next to the file name to view and verify the data of the latest file.
Process the File:
After confirming that the correct file is in the queue, right-click on the subsequent processor that processes the file and select the "Run Once" option.
Ensure that the file is successfully processed by monitoring the processor's status and logs.
Schedule Processors:
Once the file is successfully processed, you can schedule the processors in your Napita flow as needed.
Verify that the data flow is functioning as expected by monitoring subsequent data processing steps.
By following these troubleshooting steps, you can clear an invalid file from a Napita queue and return the data flow to normal processing.
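The manual steps above can also be scripted. The sketch below is a minimal illustration, assuming that Napita exposes the standard Apache NiFi REST API. The base URL and all IDs are hypothetical placeholders, and the endpoint paths and the `RUN_ONCE` state should be checked against your Napita version. The code only builds the requests in order; it does not send them.

```python
# Hypothetical base URL; replace with the address of your own Napita instance.
BASE_URL = "https://napita.example.com/nifi-api"


def empty_queue_request(connection_id):
    """Build the request that drops every FlowFile queued in one connection."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/flowfile-queues/{connection_id}/drop-requests",
        "body": None,
    }


def list_queue_request(connection_id):
    """Build the request that lists the FlowFiles queued in one connection."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/flowfile-queues/{connection_id}/listing-requests",
        "body": None,
    }


def run_once_request(processor_id, revision_version):
    """Build the request that runs a processor a single time ("Run Once")."""
    return {
        "method": "PUT",
        "url": f"{BASE_URL}/processors/{processor_id}/run-status",
        "body": {
            # The revision must match the processor's current revision.
            "revision": {"version": revision_version},
            "state": "RUN_ONCE",
        },
    }


def recovery_plan(connection_id, upstream, downstream):
    """Return the requests in the same order as the manual steps.

    `upstream` and `downstream` are (processor_id, revision_version) pairs.
    """
    return [
        empty_queue_request(connection_id),  # Empty the Queue
        run_once_request(*upstream),         # Re-run the Processor
        list_queue_request(connection_id),   # Verify Queue Contents
        run_once_request(*downstream),       # Process the File
    ]


if __name__ == "__main__":
    plan = recovery_plan("connection-id", ("fetch-id", 0), ("next-id", 0))
    for step in plan:
        print(step["method"], step["url"])
```

Keeping the plan as plain data makes it easy to review the order of operations before any request is sent with an HTTP client of your choice.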
This document outlines the Standard Operating Procedure (SOP) for diagnosing and resolving issues where the data exported from HotWax Commerce does not match the required data specifications.
HotWax Commerce uses Napita to transform and export data. If the SQL query in NiFi (Napita) is incorrect, it can result in exporting data that does not meet the client's requirements. This SOP will guide you through the steps to identify and rectify such issues.
Access the Exported Data:
Navigate to the location where the exported data is stored (e.g., SFTP location).
Download and review the exported data file.
Compare with Required Data:
Obtain the data requirements from the client.
Compare the exported data against the required data specifications to identify discrepancies.
Check the Last Sync:
Verify the last sync time to ensure that the latest data has been exported.
Review Recent Changes:
Check for any recent changes in the data requirements or the Napita setup.
Access NiFi:
Log in to the Napita instance.
Locate the Relevant Process Groups:
Identify the parent process groups related to the data export.
Drill down to the relevant root process groups where the data transformation occurs.
Stop the Processors:
Right-click on the Napita canvas.
Stop the processors to prevent further data export during troubleshooting.
Access Parameters:
Select the Parameters option to open a new modal that lists all existing parameters of the group.
Search for the SQL Query:
Look for the parameter named source.sql.query.
Review and Modify the SQL Query:
Study the current SQL query to understand its logic.
Modify the SQL query as per the client’s data requirements.
Run the Processors:
Run the processors once to generate a new data export.
Check the results in the SFTP location.
Verify the Data:
Compare the newly exported data with the required data specifications.
Ensure that the data now matches the client's requirements.
Resume Processors:
If the data is accurate, schedule the processors to resume regular operation.
Monitor the first few exports to ensure continued accuracy.
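The comparison between the exported file and the client's requirements can be partly automated. The sketch below is a minimal illustration, assuming the export is a CSV file with a header row; the column names in the usage example are hypothetical and are not taken from any real client specification. It checks only column names, not the values in each row.

```python
import csv
import io


def compare_columns(exported_csv_text, required_columns):
    """Return (missing, unexpected) column names for an exported CSV file."""
    reader = csv.reader(io.StringIO(exported_csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    exported = [name.strip() for name in header]
    missing = [c for c in required_columns if c not in exported]
    unexpected = [c for c in exported if c not in required_columns]
    return missing, unexpected


if __name__ == "__main__":
    # Hypothetical export contents and required columns.
    sample = "orderId,sku,quantity\n1001,ABC-1,2\n"
    required = ["orderId", "sku", "facilityId"]
    missing, unexpected = compare_columns(sample, required)
    print("Missing columns:", missing)
    print("Unexpected columns:", unexpected)
```

A missing or unexpected column usually points to the SELECT list of the query stored in source.sql.query, which is a reasonable first place to look.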
This SOP outlines the steps required to configure and manage SFTP Retry for Fetch SFTP and Put SFTP processors in Apache NiFi, ensuring adherence to best practices.
Navigate to Apache NiFi > Process Group > Fetch/Put SFTP Processor.
Set the comms.failure relationship to Retry. Configure the following values:
Number of Retry Attempts: 2
Retry Back Off Policy: Penalize
Retry Back Off Duration: 10 min (default)
Penalty Duration: 30 sec (default)
Add a funnel to the Fetch SFTP Processor.
Redirect the following relationships to the funnel:
comms.failure
permission.denied
not.found
Name the connected relationship: "SFTP Fetch Fail".
The relationship name must match exactly.
Set the failure and reject relationships to Retry. Configure the following values:
Number of Retry Attempts: 2
Retry Back Off Policy: Penalize
Retry Back Off Duration: 10 min (default)
Penalty Duration: 30 sec (default)
Add a funnel to the Put SFTP Processor.
Redirect the following relationships to the funnel:
failure
reject
Name the connected relationship: "SFTP Put Fail".
The relationship name must match exactly.
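The retry values above can be recorded as data so that both processors are configured consistently. The sketch below is a minimal illustration, not the exact payload Napita expects. The field names `retryCount`, `retriedRelationships`, `backoffMechanism`, `maxBackoffPeriod`, and `penaltyDuration` are assumed to follow the processor configuration in recent Apache NiFi releases; `SFTP_RETRY_PLAN` and `name_is_valid` are hypothetical helpers introduced for illustration.

```python
def retry_settings(relationships):
    """Return the retry values from this SOP for the given relationships."""
    return {
        "retryCount": 2,                          # Number of Retry Attempts
        "retriedRelationships": sorted(relationships),
        "backoffMechanism": "PENALIZE_FLOWFILE",  # Retry Back Off Policy: Penalize
        "maxBackoffPeriod": "10 mins",            # Retry Back Off Duration
        "penaltyDuration": "30 sec",              # Penalty Duration
    }


# One entry per processor type: retried relationships, relationships routed
# to the funnel, and the exact name required for the funnel connection.
SFTP_RETRY_PLAN = {
    "Fetch SFTP": {
        "retry": retry_settings(["comms.failure"]),
        "to_funnel": ["comms.failure", "permission.denied", "not.found"],
        "connection_name": "SFTP Fetch Fail",
    },
    "Put SFTP": {
        "retry": retry_settings(["failure", "reject"]),
        "to_funnel": ["failure", "reject"],
        "connection_name": "SFTP Put Fail",
    },
}


def name_is_valid(processor_type, connection_name):
    """Check that a connection name matches the required name exactly."""
    return SFTP_RETRY_PLAN[processor_type]["connection_name"] == connection_name
```

The exact-match check is case-sensitive on purpose, because the monitoring steps search for these connection names verbatim.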
---
Access the SFTP processor where the files are queued.
Redirect the funnel relationships ("SFTP Fetch Fail" or "SFTP Put Fail") back to the original processor by connecting the funnel to the respective processor.
This will create a loop to re-run the failures.
Process all the queued files.
Perform this action for both the Fetch and Put SFTP processors, as applicable.
Once the queue has been processed and cleared, remove the connection between the funnel and the original processor to prevent an infinite loop in case of future failures.
Ensure the queued files are correctly processed after redirection.
Click on the hamburger icon in NiFi's main navigation bar.
Select Summary.
A new pop-up window titled "NiFi Summary" will appear.
Go to the Connections tab.
Search for the relationships "SFTP Fetch Fail" or "SFTP Put Fail" in the list.
Select By Name.
Sort the Queue (Size) column in descending order by clicking the column header.
Click on the arrow icon corresponding to the desired relationship to navigate directly to the associated processor.
Review the queued files for the processor and follow the resolution steps mentioned above to ensure proper processing.
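The filter-and-sort step in the Summary window can be mirrored in a small script for periodic checks. The sketch below is a minimal illustration that works on a plain list of connections; the dictionary keys `name` and `queued` are hypothetical and do not represent the shape of any NiFi response.

```python
FAILURE_CONNECTION_NAMES = ("SFTP Fetch Fail", "SFTP Put Fail")


def failure_queues(connections):
    """Return failure connections that hold queued files, largest queue first.

    Each connection is a dict with the hypothetical keys "name" and "queued".
    """
    matches = [
        c for c in connections
        if c["name"] in FAILURE_CONNECTION_NAMES and c["queued"] > 0
    ]
    return sorted(matches, key=lambda c: c["queued"], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data.
    sample = [
        {"name": "SFTP Fetch Fail", "queued": 3},
        {"name": "success", "queued": 40},
        {"name": "SFTP Put Fail", "queued": 12},
        {"name": "SFTP Put Fail", "queued": 0},
    ]
    for connection in failure_queues(sample):
        print(connection["name"], connection["queued"])
```

Because the match is exact, a connection named with different casing or spacing would be skipped, which is why the naming rule in this SOP matters.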
Ensure all relationship names and funnel configurations strictly adhere to the specified formats:
SFTP Fetch Fail
SFTP Put Fail
Check for queued files periodically to prevent bottlenecks in data flow.