HBase Thrift MapReduce jobs

    1,619 hbase thrift mapreduce jobs found, prices in USD

    Content-based recommendation system using MapReduce, i.e. given a job description you should be able to suggest a set of applicable courses

    $172 (Avg Bid)
    5 bids

    I want to run pouchdb-node on AWS Lambda. Source code: Detailed Requirements: - Deploy pouchdb-node to AWS Lambda. - Use EFS as the storage layer. - OK to limit concurrency to 1 to avoid race conditions. - Expose via Lambda HTTPS endpoints (no API Gateway). - The basic PUT/GET functions, replication, and MapReduce must all work. Project Deliverables: - A deployment script which packages pouchdb-node and deploys it to AWS using SAM or CloudFormation. Development Process: - I will not give access to my AWS accounts. - You develop in your own environment and deliver the completed solution to me.

    $187 (Avg Bid)
    8 bids
    Data Engineer (Ended)

    ...oriented discussion. Must Have: ● At least 6+ years of total IT experience ● At least 4+ years of experience in design and development using the Hadoop technology stack and programming languages ● Hands-on experience in 2 or more areas: o Hadoop, HDFS, MR o Spark Streaming, Spark SQL, Spark ML o Kafka/Flume o Apache NiFi o Worked with Hortonworks Data Platform o Hive / Pig / Sqoop o NoSQL databases: HBase/Cassandra/Neo4j/MongoDB o Visualisation & reporting frameworks like D3.js, Zeppelin, Grafana, Kibana, Tableau, Pentaho o Scrapy for crawling websites o Good to have knowledge of Elasticsearch o Good to have understanding of Google Analytics data streaming o Data security (Kerberos/OpenLDAP/Knox/Ranger) ● Should have a very good overview of the current landscape and ability to...

    $29 / hr (Avg Bid)
    6 bids

    ...to you how you pick necessary features and build the training that creates matching courses for job profiles. These are the suggested steps you should follow: Step 1: Set up a Hadoop cluster where the data sets are stored on the set of Hadoop data nodes. Step 2: Implement a content-based recommendation system using MapReduce, i.e. given a job description you should be able to suggest a set of applicable courses. Step 3: Execute the training step of your MapReduce program using the data set stored in the cluster. You can use a subset of the data depending on the system capacity of your Hadoop cluster. You have to use an appropriate subset of features in the data set for effective training. Step 4: Test your recommendation system using a set of requests that execute ...
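A minimal sketch of the map/reduce logic for such a recommendation step. The bag-of-words overlap score and all function names here are illustrative assumptions, not part of the brief; a real system might use TF-IDF or cosine similarity instead.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def map_course(job_tokens, course_id, course_text):
    """Map step: for one course record, emit (course_id, overlap score)
    against the job description. Simple bag-of-words overlap is an
    assumption made for illustration."""
    job_counts = Counter(job_tokens)
    course_counts = Counter(tokenize(course_text))
    score = sum(min(n, course_counts[t]) for t, n in job_counts.items())
    return (course_id, score)

def reduce_top_k(scored_pairs, k=3):
    """Reduce step: keep the k highest-scoring courses."""
    return sorted(scored_pairs, key=lambda kv: -kv[1])[:k]
```

On a Hadoop cluster these functions would be wrapped in streaming mapper/reducer scripts; locally the pipeline is `reduce_top_k([map_course(tokenize(job), c, t) for c, t in courses.items()])`.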

    $133 (Avg Bid)
    5 bids

    ...metrics to show which is the better method. OR ii) An improvement on the methodology used in (a) that will produce a better result. 2. Find a suitable paper on replication of data in the Hadoop MapReduce framework. a) Implement the methodology used in the paper b) i) Write a program to split identified intermediate results from (1 b(i)) appropriately into 64 MB/128 MB and compare with 2(a) using the same metrics to show which is the better method. OR ii) An improvement on the methodology used in 2(a) that will produce a better result 3. Find a suitable paper on allocation strategies of data/tasks to nodes in the Hadoop MapReduce framework. a) Implement the methodology used in the paper b) i) Write a program to reallocate the splits from (2(b(i)) above to nodes by considering the capability ...

    $158 (Avg Bid)
    5 bids

    ... SQL concepts, data modelling techniques & data engineering concepts are a must. Hands-on experience in ETL processes and performance optimization techniques is a must. The candidate should have taken part in architecture design and discussion. Minimum of 2 years of experience working with batch processing / real-time systems using various technologies like Databricks, HDFS, Redshift, Hadoop, Elastic MapReduce on AWS, Apache Spark, Hive/Impala, Pig, Kafka, Kinesis, Elasticsearch and NoSQL databases. Minimum of 2 years of experience working on data warehouse or data lake projects in a role beyond just data consumption. Minimum of 2 years of extensive working knowledge of building scalable solutions in AWS. An equivalent level of experience in Azure or Google Cloud is also acceptable. M...

    $1864 (Avg Bid)
    17 bids

    Data Engineers (6+ yrs): At least 6+ years of total IT experience ● At least 4+ years of experience in design and development using the Hadoop technology stack and programming languages ● Hands-on experience in 2 or more areas: o Hadoop, HDFS, MR o Spark Streaming, Spark SQL, Spark ML o Kafka/Flume o Apache NiFi o Worked with Hortonworks Data Platform o Hive / Pig / Sqoop o NoSQL databases: HBase/Cassandra/Neo4j/MongoDB o Visualisation & reporting frameworks like D3.js, Zeppelin, Grafana, Kibana, Tableau, Pentaho o Scrapy for crawling websites o Good to have knowledge of Elasticsearch o Good to have understanding of Google Analytics data streaming o Data security (Kerberos/OpenLDAP/Knox/Ranger) ● Should have a very good overview of the current landscape and ability t...

    $1956 (Avg Bid)
    2 bids

    Aparupa: As discussed, we would like the following information: Company Name, Company Contact Phone Number, Company Contact Email, Company Address. For: Antique Stores, Antique Dealers, Thrift Shops, Antique Malls within a 250-mile radius of zip code 12305. We would like this detail provided in Excel format. The milestone for this project would be a completed list of as many contacts as can be delivered in a 72-hour period from time of acceptance of the award to manage this project.

    $150 (Avg Bid)
    1 bid

    1. LDAP service on Oracle Linux 7.3 with RPM packages. 2. LDAP for HDP-2.5.3.0. 3. Install and configure Ranger HDP service security on HBase and Solr tables and collections, and validate the security with 5 users

    $156 (Avg Bid)
    1 bid

    ...taking advantage of the CI/CD pipelines when possible - Help with troubleshooting and configuration fine-tuning on several platforms (Apache, Hadoop, HBase, etc.) - Build and maintain a local testing environment replica for developers. - Help plan for "non hyper cloud" deployments. OpenStack, Proxmox, Kubernetes: all are on the table, but the most "appropriate" one must be selected considering the architecture and CI/CD capabilities. - Build and maintain "on prem" alternatives of the AWS structure. This will include hardware planning (servers) but also deployment of several VMs (or containers at some point) with techs including PHP+nginx, Hadoop with HBase (and Phoenix), a SQL database (probably MySQL) and Ceph object storage. - Be the technical cha...

    $17 / hr (Avg Bid)
    17 bids

    The purpose of this project is to develop a working prototype of a network monitoring and reporting platform that receives network health, status, and traffic data from several network infrastructure monitoring sources, and produces an aggregate of network status data for processing by a data analytics engine. This prototype will be known as NetWatch. The NetWatch solution will utilize data processing and analytics services via the Hadoop infrastructure, and the data reporting features of HBase or the MySQL/Datameer tool. The prototype will be used by the Network A&E team to determine its viability as a working engine for network status ...

    $8 - $19
    0 bids

    Hi Mohd. I hope you are well. I have some Big Data exercises (Hive, Pig, sed and MapReduce); I would like to know if you can help me

    $80 (Avg Bid)
    1 bid

    This consists of developing a master-worker scheme, the most common processing model in distributed computing environments, similar to well-known models such as MapReduce

    $86 (Avg Bid)
    4 bids

    Please have a look at the stack below. 1. Bash scripting. 2. Hive. 3. Scala/Spark. 4. HBase and other common big data technologies.

    $509 (Avg Bid)
    Local
    17 bids

    - Backup HBase database on internal infrastructure

    $16 / hr (Avg Bid)
    3 bids

    We are looking for a machine learning engineer who must have the following experience: 1. Python coding: 7+ years of experience 2. Machine learning: 5+ years of experience (Scikit-Learn, TensorFlow, Caffe, MXNet, Keras, XGBoost) 3. AI/deep learning: 5+ years of experience 4. Cloud computing: AWS, S3, EC2, EMR, SageMaker, ECS, Lambda, IAM 5. Distributed computing technology: Hadoop, Spark, HBase, Hive/Impala, or any similar technology. Should be an independent developer, NO CONSULTING COMPANY. There will be a series of technical interviews about Python coding, machine learning, AI and cloud computing. The candidate must have excellent Python coding skills and be able to answer challenging Python questions during the interview

    $55 / hr (Avg Bid)
    13 bids

    Design, code and test: Hive, Sqoop, HBase, YARN, UNIX shell scripting. Spark and Scala are mandatory. You should have working experience from previous projects, not beginner-level projects, so please be ready to design, develop and fix bugs. Working hours and the rest we can decide over chat.

    $55 / hr (Avg Bid)
    4 bids

    1) Develop an aggregate of these reviews using your knowledge of Hadoop and MapReduce in Microsoft HDInsight. a) Follow the same approach as the Big Data Analytics Workshop (using the wordcount method in HDInsight) to determine the contributory words for each level of rating. b) Present the workflow of using HDInsight (you may use screen captures) along with a summary of findings and any insights for each level of rating. MapReduce documentation for HDInsight is available here 2) Azure Databricks for some insights. Provide the following: a) A screen capture of the completed model diagram and any decisions you made in training the model. For example, the rationale for some of the components used, and how many records have been used for training and how many for testing. b) A set of ...

    $148 (Avg Bid)
    7 bids

    I am living in Iraq, and I would like to have a creative and passionate person help me make content that emphasizes human connection worldwide and the concept that brotherhood can cross man-made borders. Here's an example: last time, I went into a thrift store here, and I found a framed photo of a couple having a fancy dinner date. I picked it up after filming it among the other items. I would like to start a series where I can use the power of the internet to search for the people, hopefully tell their stories, and also get the picture back to them. I need to work with someone who's really passionate about human connection and shares my passion for creating something on TikTok. Thank you

    $22 (Avg Bid)
    25 bids

    Hi Asma, this is for the grey label and all file types associated with it.

    $59 (Avg Bid)
    1 bid

    I am trying to run the hbase backup command and got the error below: root@machine:~/hbase-2.4.12# hbase backup Error: Could not find or load main class backup Caused by: : backup. I need to fix that. HBase is installed; just enable the configuration in the XML file, start HBase and confirm it is working well. HBase runs on Linux (Ubuntu). Some tips below:
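For reference, HBase's backup/restore feature is disabled by default, which commonly produces exactly this "Could not find or load main class backup" error. The HBase reference guide enables it in hbase-site.xml roughly as below; treat the property names as a sketch to verify against the 2.4.x documentation, and append to any values you already set rather than replacing them:

```xml
<!-- hbase-site.xml: enable the backup subsystem (per the HBase reference guide) -->
<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
```

After restarting HBase, `hbase backup help` should resolve the `backup` subcommand.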

    $13 / hr (Avg Bid)
    3 bids

    Moving data from WKC to Atlas. There is an issue in one of the category relationship mappings

    $88 (Avg Bid)
    3 bids

    I am living in Iraq, and I would like to have a creative and passionate person help me make content that emphasizes human connection worldwide and the concept that brotherhood can cross man-made borders. Here's an example: last time, I went into a thrift store here, and I found a framed photo of a couple having a fancy dinner date. I picked it up after filming it among the other items. I would like to start a series where I can use the power of the internet to search for the people, hopefully tell their stories, and also get the picture back to them. I need to work with someone who's really passionate about human connection and shares my passion for creating something on TikTok. Thank you

    $15 (Avg Bid)
    24 bids

    Roles And R...high-performance web services for data tracking. High-speed querying. Managing and deploying HBase. Being a part of a POC effort to help build new Hadoop clusters. Test prototypes and oversee handover to operational teams. Propose best practices/standards. Skills Required: Good knowledge of back-end programming, specifically Java, JS, Node.js and OOAD. Good knowledge of database structures, theories, principles, and practices. Ability to write Pig Latin scripts. Hands-on experience in HiveQL. Familiarity with data loading tools like Flume and Sqoop. Knowledge of workflow schedulers like Oozie. Analytical and problem-solving skills applied to the Big Data domain. Proven understanding of Hadoop, HBase, Hive, and Pig. Good aptitude in multi-threading and...

    $11 / hr (Avg Bid)
    1 bid

    Hi Tapasi K., I noticed your profile and would like to offer you my project. Write a spark-submit job that accesses data in a Hive table in one Hadoop/Spark cluster, accesses data in an HBase table in another Hadoop cluster, combines (does some aggregation on) this data, and saves the result in both Hive and HBase. P.S. Hive is in a different Hadoop cluster than HBase (both are in the same network / VPC subnet).

    $100 (Avg Bid)
    1 bid

    This consists of developing a master-worker scheme, the most common processing model in distributed computing environments, similar to well-known models such as MapReduce

    $51 (Avg Bid)
    1 bid

    I am looking for a Java developer who is - familiar with Hadoop architecture and MapReduce scheduling - familiar with modifying open-source packages

    $261 (Avg Bid)
    5 bids

    ...7910/DVN/HG7NV7 4. Design, implement and run an Oozie workflow to find out a. the 3 airlines with the highest and lowest probability, respectively, of being on schedule; b. the 3 airports with the longest and shortest average taxi time per flight (both in and out), respectively; and c. the most common reason for flight cancellations. • Requirements: 1. Your workflow must contain at least three MapReduce jobs that run in fully distributed mode. 2. Run your workflow to analyze the entire data set (total 22 years from 1987 to 2008) at one time on two VMs first and then gradually increase the system scale to the maximum allowed number of VMs for at least 5 increment steps, and measure each corresponding workflow execution time. 3. Run your workflow to analyze the data in a prog...
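A skeleton of what such an Oozie workflow could look like. Everything named here (the workflow name, `example.*` mapper/reducer classes, `${inputDir}`/`${outputDir}` properties) is a placeholder, and only the first of the three required map-reduce actions is shown:

```xml
<workflow-app xmlns="uri:oozie:workflow:0.5" name="airline-analysis">
  <start to="on-schedule"/>
  <!-- Job 1 of 3: on-schedule probability per airline. Actions for
       taxi time and cancellation reason would follow the same pattern,
       chained via their ok/to transitions. -->
  <action name="on-schedule">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <!-- new-API jobs also need mapred.mapper.new-api /
             mapred.reducer.new-api set to true -->
        <property>
          <name>mapreduce.job.map.class</name>
          <value>example.OnScheduleMapper</value>
        </property>
        <property>
          <name>mapreduce.job.reduce.class</name>
          <value>example.OnScheduleReducer</value>
        </property>
        <property>
          <name>mapreduce.input.fileinputformat.inputdir</name>
          <value>${inputDir}</value>
        </property>
        <property>
          <name>mapreduce.output.fileoutputformat.outputdir</name>
          <value>${outputDir}/on-schedule</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Workflow failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Timing the full workflow at each cluster size (2 VMs upward) can then be read off the Oozie job's start/end timestamps.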

    $211 (Avg Bid)
    7 bids

    ...7910/DVN/HG7NV7 4. Design, implement and run an Oozie workflow to find out a. the 3 airlines with the highest and lowest probability, respectively, of being on schedule; b. the 3 airports with the longest and shortest average taxi time per flight (both in and out), respectively; and c. the most common reason for flight cancellations. • Requirements: 1. Your workflow must contain at least three MapReduce jobs that run in fully distributed mode. 2. Run your workflow to analyze the entire data set (total 22 years from 1987 to 2008) at one time on two VMs first and then gradually increase the system scale to the maximum allowed number of VMs for at least 5 increment steps, and measure each corresponding workflow execution time. 3. Run your workflow to analyze the data in a prog...

    $22 (Avg Bid)
    4 bids

    ...7910/DVN/HG7NV7 4. Design, implement and run an Oozie workflow to find out a. the 3 airlines with the highest and lowest probability, respectively, of being on schedule; b. the 3 airports with the longest and shortest average taxi time per flight (both in and out), respectively; and c. the most common reason for flight cancellations. • Requirements: 1. Your workflow must contain at least three MapReduce jobs that run in fully distributed mode. 2. Run your workflow to analyze the entire data set (total 22 years from 1987 to 2008) at one time on two VMs first and then gradually increase the system scale to the maximum allowed number of VMs for at least 5 increment steps, and measure each corresponding workflow execution time. 3. Run your workflow to analyze the data in a prog...

    $12 (Avg Bid)
    5 bids
    data project (Ended)

    ...7910/DVN/HG7NV7 4. Design, implement and run an Oozie workflow to find out a. the 3 airlines with the highest and lowest probability, respectively, of being on schedule; b. the 3 airports with the longest and shortest average taxi time per flight (both in and out), respectively; and c. the most common reason for flight cancellations. • Requirements: 1. Your workflow must contain at least three MapReduce jobs that run in fully distributed mode. 2. Run your workflow to analyze the entire data set (total 22 years from 1987 to 2008) at one time on two VMs first and then gradually increase the system scale to the maximum allowed number of VMs for at least 5 increment steps, and measure each corresponding workflow execution time. 3. Run your workflow to analyze the data in a prog...

    $144 (Avg Bid)
    6 bids
    data project (Ended)

    ...7910/DVN/HG7NV7 4. Design, implement and run an Oozie workflow to find out a. the 3 airlines with the highest and lowest probability, respectively, of being on schedule; b. the 3 airports with the longest and shortest average taxi time per flight (both in and out), respectively; and c. the most common reason for flight cancellations. • Requirements: 1. Your workflow must contain at least three MapReduce jobs that run in fully distributed mode. 2. Run your workflow to analyze the entire data set (total 22 years from 1987 to 2008) at one time on two VMs first and then gradually increase the system scale to the maximum allowed number of VMs for at least 5 increment steps, and measure each corresponding workflow execution time. 3. Run your workflow to analyze the data in a prog...

    $10 (Avg Bid)
    3 bids

    We need to hire a Hadoop and Spark expert. Tasks to be done: - Configure a Hadoop cluster properly in HA mode - Configure a Spark cluster properly in HA mode - Install and configure HBase - Install and configure Oozie - Install and configure SSL for all the tools mentioned above - Configure authentication for all the tools mentioned above. Installation will be done in an on-premise environment. A Linux-based OS (CentOS 9) will be used. All the Hadoop and Spark software will be the full open source version. We are not using Cloudera, Hortonworks, MapR or similar. The project will be paid at an hourly rate for the amount of time it takes to finish the tasks mentioned above. Only tech folks with experience will be considered! :)

    $57 / hr (Avg Bid)
    2 bids

    Hey, I'm looking to get a quote for someone who can do Roblox programming and environment creation as well as scripting. What we're after is a modified version of this - The above only allows for donations in the form of Robux, but what we need is for people to be able to donate clothing items as well as any other items they may have in their inventory. If straight donation is not possible in Roblox, we could also do a trading system where every time they donate an item they get a custom token from the thrift store in return. I would love to have this quote ASAP as we are soon going into the discovery phase for our clients.

    $3004 (Avg Bid)
    10 bids

    Familiarity with the Hadoop ecosystem and its components: obviously, a must! Ability to write reliable, manageable, and high-performance code. Expert knowledge of Hadoop HDFS, Hive, Pig, Flume and Sqoop. Working experience in HQL. Experience writing Pig Latin and MapReduce jobs. Good knowledge of the concepts of Hadoop. Analytical and problem-solving skills, and the application of these skills in the Big Data domain. Understanding of data loading tools such as Flume, Sqoop etc. Good knowledge of database principles, practices, structures, and theories

    $641 (Avg Bid)
    2 bids

    - Existing infrastructure needs to be backed up with Ansible - Should have knowledge of the following technologies - Ansible - Terraform - Docker - Kubernetes - Postgres - HBase - Gitlab

    $11 / hr (Avg Bid)
    4 bids

    Using Ansible, harvest Twitter data with geo-coordinates using the Twitter API and put it into a CouchDB. The CouchDB setup may be a single node or based on a cluster setup. The cloud-based solution should use 4 VMs with 8 virtual CPUs and 500 GB of volume storage. The data is then combined with other useful geographic data to produce some visualization summary results using MapReduce.

    $109 (Avg Bid)
    8 bids

    Write a MapReduce program to analyze the income data extracted from the 1990 U.S. Census data and determine whether most Americans made more than $50,000, or $50,000 or less, a year in 1990. Provide the number of people who made more than $50,000 and the number of people who made $50,000 or less. Download the data from http://archive.ics.uci.edu/ml/datasets/Census+Income
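A sketch of the map and reduce steps in Python (Hadoop Streaming style), assuming the comma-separated Census Income ("Adult") layout with the income label in the last field; the function names are illustrative:

```python
def map_line(line):
    """Map step: Census Income records end with the income label
    ('>50K' or '<=50K', with a trailing '.' in the test split);
    emit (label, 1). Returns None for malformed lines."""
    label = line.strip().split(",")[-1].strip().rstrip(".")
    if label in (">50K", "<=50K"):
        return (label, 1)
    return None

def reduce_counts(pairs):
    """Reduce step: sum the counts per label."""
    totals = {}
    for key, n in pairs:
        totals[key] = totals.get(key, 0) + n
    return totals
```

With Hadoop Streaming these would be wrapped in small scripts reading stdin and writing tab-separated key/value lines; the answer to "do most make more than $50K?" is whichever key in `reduce_counts` has the larger total.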

    $162 (Avg Bid)
    Urgent
    7 bids

    ...full product life-cycles • Coding skills in JavaScript with a strong base in object-oriented design and functional programming • Strong experience in the Node.js and React.js web frameworks • Understanding of basic data structures & algorithms • Experienced with relational databases (MySQL, Postgres, etc.) and good working knowledge of SQL • Experience with non-relational databases (MongoDB, Cassandra, HBase, DynamoDB) and designing schemas • Experience in API design and best practices • Experience in building microservices-based architectures • Strong experience with any of the frameworks such as Express, Koa, Sails, StrongLoop etc. • Web fundamentals like HTML5 and CSS3 • Good design and prototyping skills • Ability to technically l...

    $1597 (Avg Bid)
    9 bids

    1. Explain the concept of Big Data and its importance in a modern economy. 2. Explain the core architecture and algorithms underpinning big data processing. 3. Analyse and visualize large data sets using a range of statistical and big data technologies. 4. Critically evaluate, select and employ appropriate tools and technologies for the development of big data applications.

    $19 - $161
    NDA signed
    2 bids

    Big Data task using Python and Hadoop with MapReduce techniques

    $16 (Avg Bid)
    6 bids

    Hadoop: implementation of a MapReduce application

    $15 (Avg Bid)
    7 bids

    Parsing, cleaning, and profiling of the attached file by removing hashtags, emoticons, and any redundant data that is not useful for analysis. The MapReduce output will be on HDFS like the attached image named "Output", but should be clean. Tasks: Dataset: Programming: MapReduce with Java. Data profiling: write MapReduce Java code to characterize (profile) the data in each column. Data cleaning: clean and profile the tweets by removing hashtags, emoticons, and any redundant data that is not useful for analysis. Write MapReduce Java code to ETL (extract, transform, load) the data source: drop some unimportant columns, normalize data in a column, and detect badly formatted rows.
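The brief asks for Java MapReduce; purely as an illustration of the cleaning rules themselves, here is a sketch in Python (the patterns for "redundant data" and the expected column count are assumptions to tune against the actual dataset; the logic ports directly into a Java mapper):

```python
import re

def clean_tweet(text):
    """Strip URLs, hashtags, @mentions, and emoticon-like symbols,
    then collapse whitespace. What counts as 'redundant' is an
    assumption; adjust the patterns to the dataset."""
    text = re.sub(r"https?://\S+", " ", text)   # URLs
    text = re.sub(r"[#@]\w+", " ", text)        # hashtags and mentions
    text = re.sub(r"[^\x00-\x7F]+", " ", text)  # non-ASCII emoji/symbols
    text = re.sub(r"[:;]-?[)(DPp]", " ", text)  # ASCII emoticons like :-) ;P
    return re.sub(r"\s+", " ", text).strip()

def is_well_formed(row, n_cols):
    """Detect badly formatted rows: keep only rows with the expected
    number of comma-separated columns (a naive CSV assumption)."""
    return len(row.split(",")) == n_cols
```

In the MapReduce version, the mapper applies `clean_tweet` per record and drops rows failing `is_well_formed`; the reducer can then aggregate per-column profiles.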

    $20 (Avg Bid)
    1 bid
    Design my Logo (Ended)

    Hi designers! My name is Victoria, and I am in the process of starting a new thrift store business. The name of my thrift store will be "Share & Care Thrift." I want to commission a circular logo in colour and black and white. I will also need a second rectangular, extended version of the logo for outdoor signage visible from the street. The logo is for a thrift store, so I'd like the design to include elements that incorporate the environment or reusing items. The word "share" in the store name evokes the image of objects passing between hands. I don't have any colour preferences for the logo, though I would like it to look professional and stand out.

    $97 (Avg Bid)
    Guaranteed
    380 entries

    ...with the architecture used throughout the company. Required skills: - Degree in Computer Science, Information Technology, or equivalent technical experience. - At least 3 years of professional experience. - Deep knowledge of and experience with statistics. - Prior programming experience, preferably in Python, Kafka or Java, and willingness to learn new languages. - Skills in Hadoop v2, MapReduce, HDFS. - Good knowledge of Big Data querying tools. - Experience with Spark. - Experience processing large quantities of data, both structured and unstructured, including integrating data from different sources. - Experience with NoSQL databases, such as Cassandra or MongoDB. - Experience with various messaging systems, such as Kafka or RabbitMQ...

    $22 / hr (Avg Bid)
    6 bids

    I need some help with a small task completing some beginning steps in Hadoop with Python. Come to the chat and I can explain more. It will not take long; the only things you need are VirtualBox and some Python & Hadoop knowledge.

    $21 (Avg Bid)
    4 bids

    Cleaning and profiling the tweets by removing hashtags, emoticons, or any redundant data which is not useful for analysis. Organize the user_location column in a common standard format. The dataset has been attached, or you can get it from the link below: Tasks: Data profiling: write MapReduce Java code to characterize (profile) the data in each column. Data cleaning: clean and profile the tweets by removing hashtags, emoticons, or any redundant data which is not useful for analysis. Write MapReduce Java code to ETL (extract, transform, load) the data source. Drop some unimportant columns, normalize data in a column, and detect badly formatted rows.

    $24 (Avg Bid)
    2 bids

    The detailed summary must contain the main theme of the paper, the approach considered for the work, its limitations, current trends in this area, and your own judgement on the weaknesses of the paper. The article is attached separately with this assignment. The summary must include the following: - Understand the contribution of the paper - Understand the technologies - Analyse the current trend with respect to each paper - Identify the drawbacks of the paper - Any alternative improvements - Follow IEEE reference style. Must be: excellent in explanation of problem understanding, explanation of technologies, explanation of the scope of the work, explanation of the limitations of the work, and explanation of improvements

    $25 (Avg Bid)
    17 bids

    Configure Hadoop and perform a word count on an input file using MapReduce on multiple nodes (for example, 1 master and 2 slave nodes). Compare the results obtained by changing the block size each time.

    $62 (Avg Bid)
    4 bids

    The most vital thing in applying the MapReduce framework to real-world problems is to identify what the keys and values are. While there are more advanced approaches, the following hint is a naïve method for inspiring your creativity. You can use the candidate median strings (of a total of 65536) as the keys, and the total matching distances of the respective candidates as the values. That means you will not get the keys from input but generate the keys (i.e., enumerating the candidate median strings) through your code on the fly. Your Map function outputs each median string paired with its total matching distance; your Reduce function reverses each key/value pair such that <k,v> becomes <v,k>. The output of Reduce will be a sorted list of the reversed pairs, and the first...
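The hint above can be sketched as follows. The distance definition (minimum Hamming distance of the candidate against every same-length substring, summed over all input strings) is the usual total matching distance from motif finding, stated here as an assumption; candidates are generated on the fly, as the brief describes:

```python
from itertools import product

def candidates(alphabet="ACGT", k=8):
    """Enumerate all |alphabet|^k candidate median strings on the fly
    (4^8 = 65536 for DNA, matching the count in the brief)."""
    for p in product(alphabet, repeat=k):
        yield "".join(p)

def total_distance(candidate, dna_strings):
    """Total matching distance: per input string, the minimum Hamming
    distance between the candidate and any same-length substring,
    summed over all strings."""
    k = len(candidate)
    total = 0
    for s in dna_strings:
        total += min(
            sum(a != b for a, b in zip(candidate, s[i:i + k]))
            for i in range(len(s) - k + 1)
        )
    return total

def reduce_reverse_sort(pairs):
    """Reduce step as described: reverse each <k,v> into <v,k> and
    sort, so the first entry is the best (smallest-distance) median."""
    return sorted((v, k) for k, v in pairs)
```

The Map function is then just `(c, total_distance(c, dna)) for c in candidates()`, and the first element of `reduce_reverse_sort`'s output is the median string.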

    $100 (Avg Bid)
    7 bids