Linux User Manual

Searching

`find`

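-ls works like the ls command and prints detailed information, e.g. find . -name '*.py' -ls
-maxdepth sets the maximum recursion depth. It is a global option, so put it before any tests or actions; GNU find prints a warning if it appears later.

Using the -delete action: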

find . -maxdepth 1 -type f -name '*~' -delete
find . -maxdepth 1 -type f -name '*.pyc' -delete

Using the -exec action:

find . -type f -exec chmod 644 {} \;
find . -name "*.php" -exec mv {} ~/codes/php \;

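# -exec needs {} as the placeholder for the matched path
find . -type d -name "__pycache__" -exec rm -rf {} \;
find . -type d -name "__pycache__" -exec rm -rf {} +
# GNU find's -depth takes no argument; -mindepth 5 restricts matches to depth greater than 4
find . -mindepth 5 -type d -name "__pycache__" -print0 -exec rm -rf {} +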
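
The two -exec terminators behave differently: `\;` runs the command once per match, while `+` batches as many paths as possible into one invocation. The following is a minimal sketch in a scratch directory; the directory and file names are made up for illustration, and the `sh -c` helper only reports how many paths it received.

```shell
# Scratch directory so nothing real is touched (hypothetical demo files)
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.txt"

# With \; the helper runs once per file, printing one line each time
per_file=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l)

# With + all three paths are passed to a single invocation
batched=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l)

echo "runs with \\;: $per_file, runs with +: $batched"
rm -rf "$demo"
```

For commands such as rm or chmod that accept many paths, the `+` form usually saves a lot of process start-up cost.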

Combining with xargs:

find . -type d -empty -print0 | xargs -0 -I {} /bin/rm -rf "{}"
find /var/cache/TheSunProject/ -type d -name "backup_*" -print0 | xargs -0 -I {} /bin/rm -rf "{}"
find . -type d -mtime +30 -print0 | xargs -I dir -0 /bin/rm -rvf "dir" > /tmp/delete.log
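
The -print0 / -0 pairing in these commands matters because plain xargs splits its input on whitespace. A small sketch of the difference, assuming the temporary directory path itself contains no spaces; the file names are invented for the demo:

```shell
# Scratch directory with one file name that contains a space
demo=$(mktemp -d)
touch "$demo/plain.log" "$demo/with space.log"

# NUL-delimited: each path arrives as exactly one argument
safe=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

# Whitespace-delimited: "with space.log" is split into two arguments
unsafe=$(find "$demo" -type f | xargs sh -c 'echo "$#"' sh)

echo "arguments with -print0/-0: $safe, without: $unsafe"
rm -rf "$demo"
```

With a destructive command such as rm -rf, the split arguments in the second form could point at unrelated paths, which is why the NUL-delimited form is the safer habit.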

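Using the -ls action:

find $HOME -maxdepth 1 -type d -ls

# -ls escapes non-ASCII characters in its output, which is unfriendly for Chinese file names; calling ls through -exec avoids this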
find $HOME -maxdepth 1 -exec ls -ldh {} + | column -t

Files larger/smaller than 10M:

find . -maxdepth 1 -type f -size +10M
find . -maxdepth 1 -type f -size -10M

# List mp3 files smaller than 3M
find ~/Music -name '*.mp3' -size -3M -ls
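
One detail worth checking when using size units: GNU find rounds file sizes up to the next whole unit before comparing, so -size -1M matches only empty files. The sketch below assumes GNU find; the two demo files are hypothetical.

```shell
demo=$(mktemp -d)
: > "$demo/empty"               # 0 bytes
printf 'x' > "$demo/one_byte"   # 1 byte, counted as 1 MiB when the unit is M

# Rounded up to MiB units: only the empty file is "less than 1M"
under_1m=$(find "$demo" -type f -size -1M | wc -l)

# Both files count as less than 2M
under_2m=$(find "$demo" -type f -size -2M | wc -l)

# Byte units avoid the rounding: both files are below 1048576 bytes
under_1m_bytes=$(find "$demo" -type f -size -1048576c | wc -l)

echo "-1M: $under_1m, -2M: $under_2m, -1048576c: $under_1m_bytes"
rm -rf "$demo"
```

By the same rule, -size -10M selects files of at most 9 MiB; use byte units (the c suffix) when an exact threshold matters.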

Regular expressions:

find ./apps/api/ -maxdepth 3 -regextype "posix-egrep" -regex ".*/(vpc|server)\.py"
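
Two properties of -regex are easy to miss: the pattern must match the whole path, not just the file name, and an unescaped dot matches any character. A minimal sketch with made-up file names, assuming GNU find (for -regextype):

```shell
demo=$(mktemp -d)
mkdir -p "$demo/api"
touch "$demo/api/vpc.py" "$demo/api/server.py" "$demo/api/other.py" "$demo/api/vpcXpy"

# Escaped dot: only vpc.py and server.py match
strict=$(find "$demo" -regextype posix-egrep -regex '.*/(vpc|server)\.py' | wc -l)

# Unescaped dot: vpcXpy matches as well
loose=$(find "$demo" -regextype posix-egrep -regex '.*/(vpc.py|server.py)' | wc -l)

# Without the leading .*/ the pattern cannot match a full path
name_only=$(find "$demo" -regextype posix-egrep -regex 'vpc\.py' | wc -l)

echo "strict: $strict, loose: $loose, name only: $name_only"
rm -rf "$demo"
```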

Empty directories:

find . -type d -empty
find . -type d -empty -exec rmdir {} \;

# Empty files
find . -type f -empty

Permissions:

# Exact match
find . -type f -perm 755 -iname "*.md" -exec chmod 644 {} \;
# AND: user, group and others all have the execute bit (the listed bits are the minimum required)
find . -type f -perm -111 -iname "*.md" -exec chmod 644 {} \;
# OR: any one of u+x, g+x, o+x is enough
find . -type f -perm /111 -iname "*.md" -exec chmod 644 {} \;

# These tests are also worth knowing:
# -writable
# -readable
# -executable
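
The three -perm forms can be compared side by side. This is a small sketch with three hypothetical files whose modes are chosen so that each form selects a different number of them; the /mode syntax assumes GNU find.

```shell
demo=$(mktemp -d)
touch "$demo/exact" "$demo/all_x" "$demo/user_x"
chmod 755 "$demo/exact"    # rwxr-xr-x
chmod 775 "$demo/all_x"    # rwxrwxr-x
chmod 744 "$demo/user_x"   # rwxr--r--

# Exact mode: only the 755 file
exact=$(find "$demo" -type f -perm 755 | wc -l)

# All of u+x, g+x, o+x set: the 755 and 775 files
all_bits=$(find "$demo" -type f -perm -111 | wc -l)

# Any of u+x, g+x, o+x set: all three files
any_bit=$(find "$demo" -type f -perm /111 | wc -l)

echo "exact: $exact, all bits: $all_bits, any bit: $any_bit"
rm -rf "$demo"
```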

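Timestamps (access, status change, modification):

-amin access time (minutes)
-atime access time (days)
-cmin inode status change time (minutes); note that this is not the creation time
-ctime inode status change time (days)
-mmin modification time (minutes)
-mtime modification time (days)

Note that the day-based tests work by integer division: the file's age is divided by 24 hours and the remainder is discarded. For example, find $HOME -mtime 0 matches files modified within the last 24 hours, because any age below 24 hours divides to 0.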
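
The integer-division rule can be verified with files whose modification times are set by hand. This sketch assumes GNU touch (for the relative -d dates) and GNU find; the file names are invented.

```shell
demo=$(mktemp -d)
touch -d '2 hours ago'  "$demo/recent"     # age / 24h = 0
touch -d '30 hours ago' "$demo/yesterday"  # age / 24h = 1
touch -d '50 hours ago' "$demo/older"      # age / 24h = 2

day0=$(find "$demo" -type f -mtime 0 -exec basename {} \;)
day1=$(find "$demo" -type f -mtime 1 -exec basename {} \;)
# +1 means "more than 1", i.e. at least two full days old
over1=$(find "$demo" -type f -mtime +1 -exec basename {} \;)

echo "-mtime 0: $day0, -mtime 1: $day1, -mtime +1: $over1"
rm -rf "$demo"
```

Note the consequence for +N: a file that is 30 hours old is not matched by -mtime +1, because its age truncates to exactly 1.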

`locate`

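Similar to find -name, but it searches an index database (/var/lib/mlocate/mlocate.db) instead of the file system, so it is much faster. The trade-off is freshness: the database is only rebuilt periodically (typically once a day by a scheduled job). Run updatedb to refresh it manually.

If the command is missing, install the mlocate package.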

`whereis`

Searches for files related to a program.

-b binaries
-m man pages
-s source code

If none of these three options is given, all of the above are returned.

`which`

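This command has a narrower search scope than whereis: it only looks through the directories listed in PATH.
Several PATH directories may contain programs with the same name; which tells us which one bash will actually run by returning the first match it finds.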
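
The first-match behaviour can be seen by putting two same-named scripts in front of PATH. This is a sketch only: the hello-demo script and both directories are hypothetical, and it assumes the which command is installed.

```shell
demo=$(mktemp -d)
mkdir "$demo/first" "$demo/second"
printf '#!/bin/sh\necho first\n'  > "$demo/first/hello-demo"
printf '#!/bin/sh\necho second\n' > "$demo/second/hello-demo"
chmod +x "$demo/first/hello-demo" "$demo/second/hello-demo"

# Both directories are on PATH; the earlier one wins
resolved=$(PATH="$demo/first:$demo/second:$PATH" which hello-demo)
ran=$(PATH="$demo/first:$demo/second:$PATH" hello-demo)

echo "which reports: $resolved"
echo "running it prints: $ran"
rm -rf "$demo"
```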

`type`

type tells whether a command is provided by a shell builtin. With the -p option it prints the path of the command, much like which. For example:

$ type cd
cd is a shell builtin
$ type git
git is /usr/bin/git
$ type -p git
/usr/bin/git

`grep`

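Find the lines in a file that match a given string: $ grep "string to search for" filename

-i ignore case when matching
-c print the number of matching lines
-v select the lines that do not match
-e PATTERN use PATTERN for matching (needed for several patterns, or for a pattern that starts with -)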
-a, --text, --binary-files=text
-n, --line-number
-B, --before-context=NUM
-A, --after-context=NUM
-C, --context=NUM

Pattern selection and interpretation:
  -E, --extended-regexp     PATTERN is an extended regular expression
  -F, --fixed-strings       PATTERN is a set of newline-separated strings
  -G, --basic-regexp        PATTERN is a basic regular expression (default)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -f, --file=FILE           obtain PATTERN from FILE
  -i, --ignore-case         ignore case distinctions
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines
  -z, --null-data           a data line ends in a 0 byte, not a newline
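
A few of these options are easiest to understand on a small sample. The log lines below are made up for the sketch:

```shell
log=$(mktemp)
printf '%s\n' 'ERROR disk full' 'info: all good' 'error: retry' 'terror level 0' > "$log"

# -c with -i: count lines containing "error" in any case (includes "terror")
count_ci=$(grep -ci 'error' "$log")

# -w: whole words only, so "terror" no longer counts
count_word=$(grep -ciw 'error' "$log")

# -v: lines that do NOT match
inverted=$(grep -vi 'error' "$log")

# -n: prefix each match with its line number
numbered=$(grep -n 'retry' "$log")

echo "$count_ci / $count_word / $inverted / $numbered"
rm -f "$log"
```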

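Other

Finding lines that contain given text under the current directory
1. find . -type f -name '*.log' | xargs grep 'error'
2. find . -type f -exec grep 'error' -l {} \; # -l lists only the names of matching files; \; starts one grep per file, so ending with + instead is usually faster
  Note: -exec command {} runs the listed command on each matching file without asking for confirmation. {} is replaced by the path of the current file, and the command must be terminated with "\;" (or "+"). The -ok action is similar to -exec, but asks before each execution.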
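
As a sketch of the find plus grep combination that also copes with spaces in file names, the example below uses -exec ... {} + with grep -l. The log files and their contents are invented for the demo.

```shell
demo=$(mktemp -d)
printf 'ok\nerror: boom\n' > "$demo/app one.log"
printf 'ok\n' > "$demo/clean.log"
printf 'error in notes\n' > "$demo/notes.txt"

# Only *.log files are searched; -l prints the names of files with a match
matches=$(find "$demo" -type f -name '*.log' -exec grep -l 'error' {} +)

echo "$matches"
rm -rf "$demo"
```

Dropping -l and adding -n would print the matching lines with their line numbers instead of just the file names.
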
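
Finding duplicate files

Reference: http://blog.csdn.net/zixiaomuwu/article/details/50878383
find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
- uniq -d prints one copy of each repeated line (here, every file size that occurs more than once)
- xargs -I{} -n1 find -type f -size {}c -print0
  - xargs -I{} -n1 passes the arguments one at a time, with {} standing for the argument
  - find -type f -size {}c -print0 finds the files of exactly that size in bytes
- uniq -w32 --all-repeated=separate compares only the first 32 characters (the MD5 hash) and prints the groups of duplicate files
  
  My usage: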
```
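duplicate_files () {
  # Search under the directory given as $1 (defaults to the current directory)
  local dir="${1:-.}"
  find "$dir" -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find "$dir" -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate > /tmp/duplicate_files_result.txt
  # Drop the 32-character hash and the two separator spaces, then squeeze blank lines
  cut -c 35- /tmp/duplicate_files_result.txt | tr -s '\n'
}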
find -type f -name "*.md" -print0 | xargs -0 md5sum | sort  | uniq -w32  -D | awk -F '  ' '{print $2}'
```
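
The hash-and-compare step of this pipeline can be tried on a tiny scratch directory. This is a simplified sketch that skips the size pre-filter and hashes every file; it assumes GNU md5sum and GNU uniq (for -w), and the three files are hypothetical.

```shell
demo=$(mktemp -d)
printf 'same\n'      > "$demo/a.txt"
printf 'same\n'      > "$demo/b.txt"
printf 'different\n' > "$demo/c.txt"

# md5sum prints "<32 hex chars>  <path>"; uniq -w32 -D keeps every line
# whose hash occurs more than once, and cut strips the hash prefix
dupes=$(find "$demo" -type f -print0 | xargs -0 md5sum | sort | uniq -w32 -D | cut -c 35-)
dupe_count=$(printf '%s\n' "$dupes" | wc -l)

printf '%s\n' "$dupes"
rm -rf "$demo"
```

The size pre-filter in the full pipeline exists only to avoid hashing files that cannot have a duplicate; on large trees it saves most of the work.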

uniq

-c, --count
        prefix lines by the number of occurrences

-d, --repeated
        only print duplicate lines, one for each group

-D     print all duplicate lines

--all-repeated[=METHOD]
        like -D, but allow separating groups with an empty line; METHOD={none(default),prepend,separate}

-f, --skip-fields=N
        avoid comparing the first N fields

--group[=METHOD]
        show all items, separating groups with an empty line; METHOD={separate(default),prepend,append,both}

-i, --ignore-case
        ignore differences in case when comparing

-s, --skip-chars=N
        avoid comparing the first N characters

-u, --unique
        only print unique lines

-z, --zero-terminated
        line delimiter is NUL, not newline

-w, --check-chars=N
        compare no more than N characters in line
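
A short sketch of the most common options on a made-up word list. Keep in mind that uniq only compares adjacent lines, so unsorted input normally goes through sort first.

```shell
# Already sorted sample input (hypothetical data)
sample=$(printf '%s\n' apple apple banana cherry cherry cherry)

# -c: prefix each distinct line with its number of occurrences
counts=$(printf '%s\n' "$sample" | uniq -c)

# -d: one copy of each line that is repeated
repeated=$(printf '%s\n' "$sample" | uniq -d)

# -u: only the lines that are not repeated
singles=$(printf '%s\n' "$sample" | uniq -u)

printf '%s\n' "$counts"
echo "repeated: $(printf '%s' "$repeated" | tr '\n' ' ')"
echo "singles: $singles"
```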
