library/alloc/src/ffi/mod.rs

//! Utilities related to FFI bindings.
//!
//! This module provides utilities to handle data across non-Rust
//! interfaces, like other programming languages and the underlying
//! operating system. It is mainly of use for FFI (Foreign Function
//! Interface) bindings and code that needs to exchange C-like strings
//! with other languages.
//!
//! # Overview
//!
//! Rust represents owned strings with the [`String`] type, and
//! borrowed slices of strings with the [`str`] primitive. Both are
//! always in UTF-8 encoding, and may contain nul bytes in the middle,
//! i.e., if you look at the bytes that make up the string, there may
//! be a `\0` among them. Both `String` and `str` store their length
//! explicitly; there are no nul terminators at the end of strings
//! like in C.
//!
//! C strings are different from Rust strings:
//!
//! * **Encodings** - Rust strings are UTF-8, but C strings may use
//! other encodings. If you are using a string from C, you should
//! check its encoding explicitly, rather than just assuming that it
//! is UTF-8 like you can do in Rust.
//!
//! * **Character size** - C strings may use `char` or `wchar_t`-sized
//! characters; note that C's `char` is different from Rust's.
//! The C standard leaves the actual sizes of those types open to
//! interpretation, but defines different APIs for strings made up of
//! each character type. Rust strings are always UTF-8, so different
//! Unicode characters will be encoded in a variable number of bytes
//! each. The Rust type [`char`] represents a '[Unicode scalar
//! value]', which is similar to, but not the same as, a '[Unicode
//! code point]'.
//!
//! * **Nul terminators and implicit string lengths** - Often, C
//! strings are nul-terminated, i.e., they have a `\0` character at the
//! end. The length of a string buffer is not stored, but has to be
//! calculated; to compute the length of a string, C code must
//! manually call a function like `strlen()` for `char`-based strings,
//! or `wcslen()` for `wchar_t`-based ones. Those functions return
//! the number of characters in the string excluding the nul
//! terminator, so the buffer length is really `len+1` characters.
//! Rust strings don't have a nul terminator; their length is always
//! stored and does not need to be calculated. Accessing a string's
//! length is an *O*(1) operation in Rust, because the length is
//! stored, while in C it is an *O*(*n*) operation because the
//! length needs to be computed by scanning the string for the nul
//! terminator.
//!
//! * **Internal nul characters** - When C strings have a nul
//! terminator character, this usually means that they cannot have nul
//! characters in the middle, because a nul character would essentially
//! truncate the string. Rust strings *can* have nul characters in
//! the middle, because nul does not have to mark the end of the
//! string in Rust.
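//!
//! The length and nul-handling differences above can be sketched as
//! follows. This is a minimal illustration, not part of this module's
//! API: the helper names `rust_len_and_c_len` and `fits_in_c_string`
//! are hypothetical.
//!
//! ```rust
//! use std::ffi::CString;
//!
//! /// Returns the length Rust stores for `s`, and the length a C-style
//! /// scan would report (the number of bytes before the first nul).
//! fn rust_len_and_c_len(s: &str) -> (usize, usize) {
//!     // O(1): the length is stored alongside the string data.
//!     let rust_len = s.len();
//!     // O(n): scan for the first nul byte, much like `strlen()` does.
//!     let c_len = s.bytes().position(|b| b == 0).unwrap_or(rust_len);
//!     (rust_len, c_len)
//! }
//!
//! /// Returns `true` if `s` has no nul bytes, so it can become a `CString`.
//! fn fits_in_c_string(s: &str) -> bool {
//!     CString::new(s).is_ok()
//! }
//! ```
//!
//! With these helpers, `rust_len_and_c_len("ab\0cd")` yields `(5, 2)`:
//! Rust sees all five bytes, while a nul-terminated reading stops after
//! two. `fits_in_c_string("ab\0cd")` is `false` for the same reason.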
//!
//! # Representations of non-Rust strings
//!
//! [`CString`] and [`CStr`] are useful when you need to transfer
//! UTF-8 strings to and from languages with a C ABI, like Python.
//!
//! * **From Rust to C:** [`CString`] represents an owned, C-friendly
//! string: it is nul-terminated, and has no internal nul characters.
//! Rust code can create a [`CString`] out of a normal string (provided
//! that the string doesn't have nul characters in the middle), and
//! then use a variety of methods to obtain a raw `*mut c_char` that can
//! then be passed as an argument to functions which use the C
//! conventions for strings.
//!
//! * **From C to Rust:** [`CStr`] represents a borrowed C string; it
//! is what you would use to wrap a raw `*const c_char` that you got from
//! a C function. A [`CStr`] is guaranteed to be a nul-terminated array
//! of bytes. Once you have a [`CStr`], you can convert it to a Rust
//! <code>&[str]</code> if it's valid UTF-8, or lossily convert it by adding
//! replacement characters.
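//!
//! The two directions above can be sketched as follows. This is a
//! minimal illustration under stated assumptions, not a definitive
//! pattern: the helper names `to_c_string` and `from_c_ptr` are
//! hypothetical, and the caller is assumed to pass a valid pointer.
//!
//! ```rust
//! use std::ffi::{c_char, CStr, CString};
//!
//! /// Rust to C: builds an owned C string, or returns `None` if `s`
//! /// contains a nul byte.
//! fn to_c_string(s: &str) -> Option<CString> {
//!     CString::new(s).ok()
//! }
//!
//! /// C to Rust: borrows the C string behind `ptr` and converts it
//! /// lossily, replacing invalid UTF-8 with U+FFFD.
//! ///
//! /// # Safety
//! ///
//! /// `ptr` must point to a valid nul-terminated string that stays
//! /// alive and unmodified for the duration of the call.
//! unsafe fn from_c_ptr(ptr: *const c_char) -> String {
//!     // SAFETY: the caller upholds the contract documented above.
//!     let borrowed = unsafe { CStr::from_ptr(ptr) };
//!     borrowed.to_string_lossy().into_owned()
//! }
//! ```
//!
//! For example, `to_c_string("hello")` produces a `CString` whose
//! `as_ptr()` can be handed to C, and passing that same pointer to
//! `from_c_ptr` gives back `"hello"`. Use `CStr::to_str` instead of the
//! lossy conversion when invalid UTF-8 should be reported as an error.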
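//!
//! [Unicode scalar value]: https://www.unicode.org/glossary/#unicode_scalar_value
//! [Unicode code point]: https://www.unicode.org/glossary/#code_point
//! [`String`]: crate::string::String
//! [`CStr`]: core::ffi::CStr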

#![stable(feature = "alloc_ffi", since = "1.64.0")]

#[stable(feature = "alloc_c_string", since = "1.64.0")]
pub use self::c_str::FromVecWithNulError;
#[stable(feature = "alloc_c_string", since = "1.64.0")]
pub use self::c_str::{CString, IntoStringError, NulError};

mod c_str;