Project summary: Vendors send XML files to a staging area (a file system) every 15 minutes, as shown in the attached diagram. When an XML file is written to the staging area, NiFi fetches it and writes it to HDFS. Once the file is in HDFS, a Scala application JAR running in a WildFly server is notified ([login to view URL]). Based on that notification, the JAR programmatically submits a Spark application to the YARN ResourceManager. The ResourceManager then starts the Spark application master on one of the 5 Spark nodes, which runs the Spark driver, and the Spark jobs eventually run in Spark executors distributed across the Spark worker nodes.
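As a rough illustration of the submission step, the sketch below builds the argument list a service might pass to `spark-submit` in YARN cluster mode once it learns that a new XML file has landed in HDFS. It is written in Python rather than Scala for brevity and is not the actual application; the JAR path, main class name, and HDFS path are hypothetical placeholders.

```python
from typing import List


def build_spark_submit_args(
    hdfs_xml_path: str,
    app_jar: str = "/opt/jobs/xml-ingest.jar",      # hypothetical JAR location
    main_class: str = "com.example.XmlIngestJob",   # hypothetical main class
) -> List[str]:
    """Build a spark-submit argument list for one newly arrived XML file.

    With --deploy-mode cluster, YARN runs the Spark driver inside the
    application master on a cluster node, which matches the flow described
    above.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--class", main_class,
        app_jar,
        hdfs_xml_path,  # passed through to the job as its only argument
    ]


if __name__ == "__main__":
    # Hypothetical HDFS path, used only to show the resulting command line.
    print(" ".join(build_spark_submit_args("hdfs:///landing/vendor_a/file1.xml")))
```

The list could be handed to a process launcher such as `subprocess.run`. A JVM service would more likely use Spark's own launcher API instead of shelling out.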
The issue we have now is with the NiFi cluster. We have 2 datacenters, and clients send the XML files to a server location in one of them. How can NiFi fetch these XML files from the staging area and proceed with the flow? If one datacenter goes down, how can we redirect users to the NiFi cluster in the other datacenter and resume processing? And if we do that, how can we keep the two staging areas in both datacenters in sync? This is for an Active/Active datacenter deployment.
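To make the staging-area sync question concrete, here is a minimal sketch of one possible approach, assuming both staging areas are reachable as ordinary directories from the machine running the script. It copies XML files that are missing or different in the other datacenter, comparing content by SHA-256. The function names and the `.part` temporary suffix are hypothetical, and this is not a recommendation over tools such as rsync or replicated storage.

```python
import hashlib
import os
import shutil
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def sync_staging(src: Path, dst: Path) -> List[str]:
    """Copy *.xml files from src to dst when missing or different in dst.

    Each file is first written under a temporary ".part" name and then
    renamed, so a watcher that only matches "*.xml" should not pick up a
    half-written file. Returns the names of the files that were copied.
    """
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for src_file in sorted(src.glob("*.xml")):
        if not src_file.is_file():
            continue
        target = dst / src_file.name
        if target.exists() and file_digest(target) == file_digest(src_file):
            continue  # identical content already present
        tmp = dst / (src_file.name + ".part")
        shutil.copyfile(src_file, tmp)
        os.replace(tmp, target)  # rename within the same directory
        copied.append(src_file.name)
    return copied
```

For Active/Active, the same function would have to run in both directions (for example `sync_staging(dc1, dc2)` and then `sync_staging(dc2, dc1)`). The sketch does not propagate deletions or track which files have already been processed, so both clusters ingesting the same file would still need to be handled separately.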
2 freelancers are bidding an average of $110 for this job
Hello there, I think I can help you with this project. Let's have a call to discuss the issues you are facing and your requirements in more detail, and then we can proceed with the actual implementation. Thanks.