Let's start with the Levenshtein distance algorithm for comparing two texts. The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into the other. See the wiki for a clear explanation of this algorithm.

Method to find Levenshtein Distance

    public static int LevenshteinDistance(String s0, String s1) {

        int len0 = s0.length() + 1;
        int len1 = s1.length() + 1;

        // the array of distances
        int[] cost = new int[len0];
        int[] newcost = new int[len0];

        // initial cost of skipping prefix in String s0
        for (int i = 0; i < len0; i++)
            cost[i] = i;

        // dynamically computing the array of distances

        // transformation cost for each letter in s1
        for (int j = 1; j < len1; j++) {

            // initial cost of skipping prefix in String s1
            // (j insertions build the first j letters of s1 from nothing)
            newcost[0] = j;

            // transformation cost for each letter in s0
            for (int i = 1; i < len0; i++) {

                // matching current letters in both strings
                int match = (s0.charAt(i - 1) == s1.charAt(j - 1)) ? 0 : 1;

                // computing cost for each transformation
                int cost_replace = cost[i - 1] + match;
                int cost_insert = cost[i] + 1;
                int cost_delete = newcost[i - 1] + 1;

                // keep minimum cost
                newcost[i] = Math.min(Math.min(cost_insert, cost_delete),
                        cost_replace);
            }

            // swap cost/newcost arrays
            int[] swap = cost;
            cost = newcost;
            newcost = swap;
        }

        // the distance is the cost for transforming all letters in both strings
        return cost[len0 - 1];
    }
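
As a cross-check on the two-row method above, here is a minimal sketch of the same recurrence using the full distance table. The class name `LevenshteinSketch` and the method name `distance` are hypothetical, chosen only for this illustration. The full table uses more memory than two rows, but it can be easier to follow.

```java
// Hypothetical helper class for illustration; not part of the original post.
final class LevenshteinSketch {

    // d[i][j] is the distance between the first i characters of s0
    // and the first j characters of s1.
    static int distance(String s0, String s1) {
        int[][] d = new int[s0.length() + 1][s1.length() + 1];

        // Turning a prefix of s0 into the empty string takes one deletion per character
        for (int i = 0; i <= s0.length(); i++) {
            d[i][0] = i;
        }
        // Building a prefix of s1 from the empty string takes one insertion per character
        for (int j = 0; j <= s1.length(); j++) {
            d[0][j] = j;
        }

        for (int i = 1; i <= s0.length(); i++) {
            for (int j = 1; j <= s1.length(); j++) {
                // 0 if the current letters match, 1 if a substitution is needed
                int match = (s0.charAt(i - 1) == s1.charAt(j - 1)) ? 0 : 1;
                d[i][j] = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + match);
            }
        }
        return d[s0.length()][s1.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
        System.out.println(distance("", "abc"));           // prints 3
    }
}
```

The second call is a useful edge case: when one string is empty, the distance should equal the length of the other string.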

Percentage of Text Match 

    public static int pecentageOfTextMatch(String s0, String s1) {
        // Trim and collapse runs of whitespace into single spaces
        s0 = s0.trim().replaceAll("\\s+", " ");
        s1 = s1.trim().replaceAll("\\s+", " ");
        // Normalize the edit distance by the combined length of both strings
        float distance = LevenshteinDistance(s0, s1);
        return (int) (100 - distance * 100 / (s0.length() + s1.length()));
    }
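
To make the arithmetic concrete, here is a small sketch that applies the same formula to a distance that is already known. The class `TextMatchSketch` and the method `percentage` are hypothetical names used only for this illustration. The value 30 is the distance between the two strings from the Test section at the end of this post: "I am here" (9 characters) is the tail of the longer string (39 characters), so deleting the other 30 characters is enough.

```java
// Hypothetical helper class for illustration; not part of the original post.
final class TextMatchSketch {

    // Same normalization as in the method above: the edit distance is divided
    // by the combined length of both strings, and the result is truncated to an int.
    static int percentage(int distance, int len0, int len1) {
        return (int) (100 - (float) distance * 100 / (float) (len0 + len1));
    }

    public static void main(String[] args) {
        // Identical strings: distance 0
        System.out.println(percentage(0, 9, 9));   // prints 100
        // The two test strings: 100 - 3000 / 48 = 37.5, truncated
        System.out.println(percentage(30, 39, 9)); // prints 37
        // Two 3-letter words with no letter in common: distance 3
        System.out.println(percentage(3, 3, 3));   // prints 50
    }
}
```

Because the distance is divided by the sum of both lengths, two equally long strings that share nothing still score 50 rather than 0.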

Percentage of Match between Arrays of Strings

  1. Get as0 and as1 (arrays of Strings)
  2. Calculate the String frequencies of as0 and as1 into HashMaps hm0 and hm1
  3. Calculate the frequency difference of hm0 and hm1 into the diff HashMap
  4. Calculate the total frequency difference (the sum of the diff frequencies and the remaining hm1 frequencies)
  5. Calculate the percentage of match

    // Requires java.util.HashMap and java.util.Map
    public static int pecentageOfMatch(String[] as0, String[] as1) {
        // String frequency of as0
        Map<String, Integer> hm0 = new HashMap<String, Integer>();
        for (String s : as0) {
            Integer count = hm0.get(s);
            hm0.put(s, count == null ? 1 : count + 1);
        }

        // String frequency of as1
        Map<String, Integer> hm1 = new HashMap<String, Integer>();
        for (String s : as1) {
            Integer count = hm1.get(s);
            hm1.put(s, count == null ? 1 : count + 1);
        }

        // Frequency difference between hm0 and hm1 goes to diff
        Map<String, Integer> diff = new HashMap<String, Integer>();
        for (Map.Entry<String, Integer> pairs : hm0.entrySet()) {
            String key = pairs.getKey();
            Integer value = pairs.getValue();
            // Remove the key from hm1 so that only Strings unique to as1 remain there
            Integer value1 = hm1.remove(key);
            diff.put(key, value1 != null ? Math.abs(value1 - value) : value);
        }

        // Sum all remaining String frequencies in hm1
        int val = 0;
        for (int count : hm1.values()) {
            val += count;
        }

        // Sum all frequencies in diff
        for (int count : diff.values()) {
            val += count;
        }

        // Calculate match percentage
        int per = (int) (((float) val * 100) / (float) (as0.length + as1.length));
        return 100 - per;
    }
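
The five steps can be traced on a tiny input. The following is a minimal sketch under my own assumptions: the class `ArrayMatchSketch` and its method names are hypothetical, and it uses `Map.merge` (Java 8 or later) for counting, which is a substitution of mine rather than the approach in the method above. It folds the diff map directly into a running total.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the original post.
final class ArrayMatchSketch {

    // Step 2: count how often each String occurs
    static Map<String, Integer> frequencies(String[] items) {
        Map<String, Integer> freq = new HashMap<>();
        for (String item : items) {
            freq.merge(item, 1, Integer::sum);
        }
        return freq;
    }

    static int percentageOfMatch(String[] as0, String[] as1) {
        Map<String, Integer> hm0 = frequencies(as0);
        Map<String, Integer> hm1 = frequencies(as1);

        // Steps 3 and 4: add up the per-String frequency differences
        int totalDiff = 0;
        for (Map.Entry<String, Integer> e : hm0.entrySet()) {
            // remove() returns the count in hm1, or null if the String is absent
            Integer other = hm1.remove(e.getKey());
            totalDiff += (other == null) ? e.getValue() : Math.abs(other - e.getValue());
        }
        // Whatever is left in hm1 occurs only in as1
        for (int count : hm1.values()) {
            totalDiff += count;
        }

        // Step 5: normalize by the combined number of Strings
        int per = (int) ((float) totalDiff * 100 / (float) (as0.length + as1.length));
        return 100 - per;
    }

    public static void main(String[] args) {
        // a: 1 vs 0, b: 2 vs 1, c: 0 vs 1 -> total difference 3 out of 5 Strings
        System.out.println(percentageOfMatch(
                new String[] {"a", "b", "b"}, new String[] {"b", "c"})); // prints 40
        System.out.println(percentageOfMatch(
                new String[] {"x", "y"}, new String[] {"y", "x"}));      // prints 100
    }
}
```

The second call shows that order does not matter: only the frequencies are compared.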

Percentage of Word Match

This splits the two sentences into words and returns the match percentage of those words.

    public static int pecentageOfWordMatch(String s0, String s1) {
        // Replace all . ? ! with spaces, then trim, to make it easy to split into words
        s0 = s0.replaceAll("[.?!]", " ").trim();
        s1 = s1.replaceAll("[.?!]", " ").trim();
        // Split on runs of whitespace so that consecutive spaces do not produce empty words
        String[] as0 = s0.split("\\s+");
        String[] as1 = s1.split("\\s+");
        return pecentageOfMatch(as0, as1);
    }
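
One detail worth illustrating is how the split pattern treats consecutive spaces, which appear whenever punctuation followed by a space is replaced with another space. The sketch below compares splitting on a single space with splitting on `\\s+`. The class `WordSplitSketch` and the method `words` are hypothetical names used only here.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of the original post.
final class WordSplitSketch {

    // Replace sentence punctuation with spaces, trim, and split on runs of whitespace
    static String[] words(String s) {
        return s.replaceAll("[.?!]", " ").trim().split("\\s+");
    }

    public static void main(String[] args) {
        String s = "I work here. I am here!";

        // Splitting on a single space keeps an empty token between the two spaces
        String[] naive = s.replaceAll("[.?!]", " ").trim().split(" ");
        System.out.println(Arrays.toString(naive));    // prints [I, work, here, , I, am, here]

        // Splitting on runs of whitespace yields only real words
        System.out.println(Arrays.toString(words(s))); // prints [I, work, here, I, am, here]
    }
}
```

An empty token would be counted as a word by the frequency comparison, so it can shift the resulting percentage.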

Percentage of Sentence Match

This splits the two texts into sentences and returns the match percentage of those sentences.

    public static int pecentageOfSentenceMatch(String s0, String s1) {
        // Replace all . ? ! with ". ", then trim, to make it easy to split into sentences
        s0 = s0.replaceAll("[.?!]", ". ").trim();
        s1 = s1.replaceAll("[.?!]", ". ").trim();
        // Split on the whitespace between a "." and the letter that starts the next sentence
        String[] as0 = s0.split("(?<=[.])\\s+(?=[a-zA-Z])");
        String[] as1 = s1.split("(?<=[.])\\s+(?=[a-zA-Z])");
        return pecentageOfMatch(as0, as1);
    }
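
To see what the sentence split produces, here is a minimal sketch that applies the same normalization and a lookbehind/lookahead pattern to two sample texts. The class `SentenceSplitSketch` and the method `sentences` are hypothetical names for this illustration only.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of the original post.
final class SentenceSplitSketch {

    static String[] sentences(String text) {
        // Normalize every sentence terminator to ". " and drop the trailing space
        String normalized = text.replaceAll("[.?!]", ". ").trim();
        // Split on whitespace that follows a "." and precedes a letter;
        // the lookbehind keeps the "." attached to its sentence
        return normalized.split("(?<=[.])\\s+(?=[a-zA-Z])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                sentences("I am engineer and I work here.I am here")));
        // prints [I am engineer and I work here., I am here]

        System.out.println(Arrays.toString(sentences("Is it here? Yes!")));
        // prints [Is it here., Yes.]
    }
}
```

Note that a sentence keeps its trailing period, so "I am here." and "I am here" count as different sentences in the frequency comparison.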

Test

    String s0 = "I am engineer and I work here.I am here";
    String s1 = "I am here";
    System.out.println(LevenshteinDistance(s0, s1));      // prints 30
    System.out.println(pecentageOfTextMatch(s0, s1));     // prints 37
    System.out.println(pecentageOfWordMatch(s0, s1));     // prints 47
    System.out.println(pecentageOfSentenceMatch(s0, s1)); // prints 67

36 comments:

  4. Great post, thanks for sharing.
    There is a video that explains the Levenshtein distance algorithm. It's in Spanish, but it is very good.

    Part 1 - https://www.youtube.com/watch?v=4oTFJOQpmRY
    Part 2 - https://www.youtube.com/watch?v=83PnEZNsa-8
