LeetCode--393. UTF-8 Validation

2024-04-20 13:18
Tags: leetcode, utf, validation, 393


Problem link: https://leetcode.com/problems/utf-8-validation/

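The UTF-8 encoding rules are fairly intricate, so read the problem statement carefully to pin them down. The algorithm I came up with is plain brute force, but it is easy to follow:

1. Convert each decimal number into a fixed-length binary string of 8 characters.
2. Use two pointers to check whether the sequence is valid.

The code is as follows: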

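class Solution {
    public static boolean validUtf8(int[] data) {
        // Step 1: turn every value into a fixed-width, 8-character binary string.
        // OR-ing with 1 << 8 forces a ninth bit, so Integer.toBinaryString()
        // keeps the leading zeros; substring(1) then drops that extra bit.
        // Only the least significant 8 bits of each value are used.
        String[] strs = new String[data.length];
        for (int i = 0; i < data.length; i++)
            strs[i] = Integer.toBinaryString(1 << 8 | (data[i] & 0xFF)).substring(1);

        // Step 2: i points at a leading byte, i + k at its continuation bytes.
        int i = 0;
        while (i < data.length) {
            int count = afterBytes(strs[i]);
            if (count > 3 || count < 0)
                return false;
            for (int k = 1; k <= count; k++) {
                if (i + k >= data.length || !isValid(strs[i + k]))
                    return false;
            }
            i = i + count + 1;
        }
        return true;
    }

    // Number of continuation bytes expected after a leading byte,
    // or -1 if the byte is itself a continuation byte (10xxxxxx).
    public static int afterBytes(String str) {
        int i = 0;
        while (i < str.length() && str.charAt(i) == '1')
            i++;
        if (i == 0)
            return 0;
        if (i == 1)
            return -1;
        return i - 1;
    }

    // A continuation byte must look like 10xxxxxx.
    public static boolean isValid(String str) {
        return str.charAt(0) == '1' && str.charAt(1) == '0';
    }
}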
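The padding trick in step 1 is easy to miss, so here is a minimal sketch of it in isolation. The class name `BinaryPadding` and the helper `toBits8` are illustrative names, not part of the original solution; the sketch assumes, as the problem states, that only the low 8 bits of each integer matter.

```java
class BinaryPadding {
    // Returns the low 8 bits of value as an 8-character binary string.
    // Setting bit 8 (value 256) guarantees a 9-character result from
    // Integer.toBinaryString, and substring(1) removes that guard bit.
    static String toBits8(int value) {
        return Integer.toBinaryString(1 << 8 | (value & 0xFF)).substring(1);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(5)); // 101 (leading zeros are lost)
        System.out.println(toBits8(5));                // 00000101
        System.out.println(toBits8(197));              // 11000101
    }
}
```

Without the guard bit, `Integer.toBinaryString` drops leading zeros, and the later `charAt(0)` / `charAt(1)` checks would look at the wrong positions.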

 

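The Solutions tab offers two elegant algorithms: the first works on the binary string of each byte, and the second uses bit manipulation directly. The code is as follows: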

class Solution {
    public boolean validUtf8(int[] data) {

        // Number of bytes in the current UTF-8 character
        int numberOfBytesToProcess = 0;

        // For each integer in the data array.
        for (int i = 0; i < data.length; i++) {

            // Get the binary representation. We only need the least significant 8 bits
            // for any given number.
            String binRep = Integer.toBinaryString(data[i]);
            binRep =
                binRep.length() >= 8
                    ? binRep.substring(binRep.length() - 8)
                    : "00000000".substring(binRep.length() % 8) + binRep;

            // If this is the case then we are to start processing a new UTF-8 character.
            if (numberOfBytesToProcess == 0) {

                // Get the number of 1s in the beginning of the string.
                for (int j = 0; j < binRep.length(); j++) {
                    if (binRep.charAt(j) == '0') {
                        break;
                    }
                    numberOfBytesToProcess += 1;
                }

                // 1 byte characters
                if (numberOfBytesToProcess == 0) {
                    continue;
                }

                // Invalid scenarios according to the rules of the problem.
                if (numberOfBytesToProcess > 4 || numberOfBytesToProcess == 1) {
                    return false;
                }
            } else {

                // Else, we are processing integers which represent bytes which are a part of
                // a UTF-8 character. So, they must adhere to the pattern `10xxxxxx`.
                if (!(binRep.charAt(0) == '1' && binRep.charAt(1) == '0')) {
                    return false;
                }
            }

            // We reduce the number of bytes to process by 1 after each integer.
            numberOfBytesToProcess -= 1;
        }

        // This is for the case where we might not have the complete data for
        // a particular UTF-8 character.
        return numberOfBytesToProcess == 0;
    }
}

 

 

class Solution {
    public boolean validUtf8(int[] data) {

        // Number of bytes in the current UTF-8 character
        int numberOfBytesToProcess = 0;

        // Masks to check two most significant bits in a byte.
        int mask1 = 1 << 7;
        int mask2 = 1 << 6;

        // For each integer in the data array.
        for (int i = 0; i < data.length; i++) {

            // If this is the case then we are to start processing a new UTF-8 character.
            if (numberOfBytesToProcess == 0) {
                int mask = 1 << 7;
                while ((mask & data[i]) != 0) {
                    numberOfBytesToProcess += 1;
                    mask = mask >> 1;
                }

                // 1 byte characters
                if (numberOfBytesToProcess == 0) {
                    continue;
                }

                // Invalid scenarios according to the rules of the problem.
                if (numberOfBytesToProcess > 4 || numberOfBytesToProcess == 1) {
                    return false;
                }
            } else {

                // data[i] should have most significant bit set and
                // second most significant bit unset. So, we use the two masks
                // to make sure this is the case.
                if (!((data[i] & mask1) != 0 && (mask2 & data[i]) == 0)) {
                    return false;
                }
            }

            // We reduce the number of bytes to process by 1 after each integer.
            numberOfBytesToProcess -= 1;
        }

        // This is for the case where we might not have the complete data for
        // a particular UTF-8 character.
        return numberOfBytesToProcess == 0;
    }
}
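For comparison, the same countdown idea can be written by matching each leading byte against its header pattern instead of counting 1 bits in a loop. This is only a sketch of a variant, not the code from the Solutions tab: the class `Utf8MaskCheck`, the helper `sequenceLength`, and the counter `remaining` are names made up for illustration, and it checks only the rules stated in the problem. The first two inputs in `main` are the examples from the problem statement.

```java
class Utf8MaskCheck {
    // Returns the total length (1 to 4) of the character started by byte b,
    // or -1 if b cannot start a character under the problem's rules.
    static int sequenceLength(int b) {
        if ((b & 0x80) == 0x00) return 1; // 0xxxxxxx
        if ((b & 0xE0) == 0xC0) return 2; // 110xxxxx
        if ((b & 0xF0) == 0xE0) return 3; // 1110xxxx
        if ((b & 0xF8) == 0xF0) return 4; // 11110xxx
        return -1;                        // 10xxxxxx or 11111xxx
    }

    static boolean validUtf8(int[] data) {
        int remaining = 0; // continuation bytes still expected
        for (int value : data) {
            int b = value & 0xFF; // only the low 8 bits matter
            if (remaining == 0) {
                int len = sequenceLength(b);
                if (len < 0) return false;
                remaining = len - 1;
            } else {
                if ((b & 0xC0) != 0x80) return false; // must be 10xxxxxx
                remaining--;
            }
        }
        // A non-zero counter means the last character was cut off.
        return remaining == 0;
    }

    public static void main(String[] args) {
        System.out.println(validUtf8(new int[] {197, 130, 1})); // true
        System.out.println(validUtf8(new int[] {235, 140, 4})); // false
        System.out.println(validUtf8(new int[] {197}));         // false (truncated)
    }
}
```

Comparing against `0xC0`, `0xE0` and `0xF0` after masking checks the run of 1 bits and the terminating 0 bit in one step, so a lone continuation byte or a byte with five or more leading 1s falls through to `-1` without a separate test.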

 




