High-accuracy approximation of piecewise smooth functions using the Truncation and Encode approach

Abstract In the present work, we analyze a technique designed by Geraci et al. in [1,11] named the Truncate and Encode (TE) strategy. It was presented as a non-intrusive method for steady Partial Differential Equations (PDEs) in Uncertainty Quantification (UQ), and as a weakly intrusive method in the unsteady case. We analyze the TE algorithm applied to the approximation of functions, and in particular its performance for piecewise smooth functions. We carry out some numerical experiments, comparing the performance of the algorithm when using different linear and non-linear interpolation techniques, and provide some recommendations that we find useful in order to achieve high performance of the algorithm.


Introduction
Uncertainty Quantification (UQ) is the science that studies the quantification and the reduction of uncertainties in real applications with an intensive computational component. An example would be the computation of the fuel consumption of a car. Suppose we know how to compute the consumption as a function of some parameters (car and wind velocity, wheel condition, car weight, etc.), but we do not know the exact value of these parameters because they are random variables. Hence, the fuel consumption is also a random variable. UQ studies how to infer the model of the fuel consumption random variable by assuming models for the parameters.
To carry out such a study, many simulations of the consumption, changing the parameters among simulations, may be required. These computations tend to be very expensive, because they may involve numerically solving systems of partial differential equations.
One of the aims of UQ is to find the statistical moments of the random variable that provides the solution to a given problem. In [1,11], the authors propose a new multi-scale technique, the Truncation and Encode approach (TE henceforth), to be applied to lower the cost of the computation (by quadrature rules) of the moments of the solution of the following stochastic PDE

∂_t u(x,t,ξ) + ∂_x f(x,t,ξ,u(x,t,ξ)) = 0,

where ξ ∈ R^N is a random vector with known probabilistic distribution. Notice that for each ξ, a PDE must be solved to compute u_ξ(x,t) = u(x,t,ξ), namely

∂_t u_ξ(x,t) + ∂_x f_ξ(x,t,u_ξ(x,t)) = 0.
Consequently, the calculation of these moments is a computationally intensive task, since function evaluations for given parameter values usually require a complex numerical simulation. Many numerical methods have been developed specifically for this kind of UQ problem (for instance, methods generalizing or improving the classical Monte Carlo method, or those based on the Polynomial Chaos representation [14]), and all of them try to use as few function evaluations as possible.
The TE technique is specifically designed to meet a target accuracy in the computation of the desired integral, by ensuring a predetermined accuracy in the required function evaluations. It is especially suited for the integration of functions that are only piecewise smooth, since in this case other alternatives, such as the Polynomial Chaos method, may be highly inefficient.
In this paper we analyze a basic ingredient of the TE technique designed in [1,11]: the influence of the approximation method for piecewise smooth functions on the global efficiency of the TE technique. For this, we consider a simplified framework in which we seek to compute integrals of the type

E = ∫_0^1 f(ξ) dξ

by means of a quadrature rule using point values on an equally spaced grid. For the sake of simplicity, let us assume that we use the trapezoidal rule on a uniform mesh of grid-size h_K = 2^{-K}, i.e.

E_K = h_K ( v_0^K/2 + ∑_{i=1}^{n_K-1} v_i^K + v_{n_K}^K/2 ),  v_i^K = f(i h_K),  n_K = 2^K.    (1)
By standard results, we know that |E − E_K| = O(h_K^2) when f is sufficiently smooth, but only |E − E_K| = O(h_K) when f is merely piecewise smooth. The TE algorithm seeks to replace as many values v_i^K as possible, which require function evaluations of f, with modified values ṽ_i^K, so that

‖v^K − ṽ^K‖_∞ ≤ ε,

where ε is a user-dependent predetermined accuracy, even when f is only piecewise smooth. Notice that, denoting by Ê_K the result of applying the trapezoidal rule to ṽ^K,

|E_K − Ê_K| ≤ ‖v^K − ṽ^K‖_∞.

Hence,

|E − Ê_K| ≤ |E − E_K| + ‖v^K − ṽ^K‖_∞.

Thus, for a given (piecewise smooth) f, it is possible to ensure a target accuracy by choosing appropriately the mesh (i.e. K) and the target accuracy ε so that ‖v^K − ṽ^K‖_∞ ≤ ε.

The TE algorithm in [1,11] is based on Harten's Multiresolution Framework (MRF), which provides a set of tools to manage data in a multi-scale setting. In the last two decades, the MRF has been successfully applied in various contexts, and in particular in the design of adaptive schemes for the numerical solution of conservation laws and systems [6-8,13]. The TE algorithm follows a strategy similar to that used in the cost-effective schemes described in [7] and used in [6,8,13]. Both algorithms use the ideas in Harten's MRF to compute a sequence ṽ^K such that ‖v^K − ṽ^K‖_∞ ≤ ε without evaluating f at all the grid points (ξ_i^K)_{i=0}^{n_K}. These strategies proceed from coarse to fine resolution levels as follows: at each resolution level k ≤ K (associated to a uniform grid with spacing 2^{-k}), a vector ṽ^k is obtained, whose components either coincide with or approximate the values of f on the corresponding grid, i.e. ṽ_i^k ≈ f(i 2^{-k}). Approximations are computed in [1,7,11] via an interpolatory reconstruction. The TE authors also consider a multilevel thresholding strategy (ε_k)_{k=0}^K, which should be properly chosen to guarantee a certain precision in the output ṽ^K. In this work we show that the simple strategy ε_k = ε, also considered in [7], is more effective for the current purpose than the one proposed in [1,11], ε_k = 2^{K-k} ε. Another advantage of taking ε_k = ε is that K can be taken as large as needed without increasing the number of function evaluations.
Thus, |E − E_K| can be made arbitrarily small in (1), which implies |E − Ê_K| ≲ ε.
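The quadrature above is straightforward to realize in code. The following Python sketch (the helper name `trapezoid_uniform` is ours, not from [1,11]) evaluates E_K on the uniform grid of size h_K = 2^{-K} and illustrates the O(h_K^2) behavior for a smooth integrand:

```python
import numpy as np

def trapezoid_uniform(f, K):
    """Trapezoidal rule on the uniform grid xi_i = i * 2**-K, i = 0..2**K."""
    h = 2.0 ** (-K)
    xi = np.linspace(0.0, 1.0, 2 ** K + 1)
    v = f(xi)                                   # point values v_i^K = f(i h)
    return h * (0.5 * v[0] + v[1:-1].sum() + 0.5 * v[-1])

# Smooth integrand: the error should behave like O(h_K^2),
# i.e. halving h_K divides the error by roughly four.
exact = 1.0 - np.cos(1.0)                       # integral of sin on [0, 1]
errs = [abs(trapezoid_uniform(np.sin, K) - exact) for K in (4, 5, 6)]
ratios = [errs[i] / errs[i + 1] for i in range(2)]
```

For a merely piecewise smooth f, the same experiment would only show first-order decay, which is the situation the TE strategy targets.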
In this work we also analyze the use of high accuracy interpolation techniques, and we establish the growth rate of the number of evaluations (n_eval) with respect to the precision (‖v^K − ṽ^K‖_∞):

n_eval = O(‖v^K − ṽ^K‖_∞^{-1/s}),

where s is the approximation order of the interpolation technique. This result is independent of the smoothness of f; that is, the number of function evaluations increases slowly even if f has some discontinuities. Hence the TE algorithm is very convenient in such situations. A comparison between linear and non-linear interpolation techniques is also carried out. Our numerical results will show that non-linear techniques are dispensable in the TE strategy; since linear ones are computationally faster, they are preferable. In particular, we compare piecewise polynomial Lagrange interpolation of degrees 1 and 3 with the non-linear PCHIP interpolation developed in [5]. In [1,11], the ENO interpolation was chosen instead of PCHIP, but PCHIP interpolation is monotone, non-oscillatory, stable as a subdivision process and faster than the ENO technique.
The paper is organized as follows: In section 2 we recall Harten's MRF and define the specific prediction schemes to be considered for our study. In section 3 we recall the Truncation and Encode strategy described in [1,11] and in section 4 we perform a series of numerical experiments that confirm our observations. We close the paper with some conclusions and perspectives for future work.

Presentation
The TE technique is based on Harten's Multiresolution Framework (MRF), which relies on two basic elements: discretization and (compatible) reconstruction operators. In this paper we only consider the interpolatory framework for functions defined in the unit interval. The reader is referred to [3,12] for a more complete description of Harten's MRF.
In the interpolatory framework, the discretization operators (D_k)_{k≥0} obtain the point values of a function f on a sequence of nested grids (ξ^k)_{k≥0}, i.e.

(D_k f)_i = f(ξ_i^k),  0 ≤ i ≤ n_k.
Together with the discretization operators, a sequence of interpolatory reconstruction operators (R_k)_{k≥0} is considered, which verify (R_k v^k)(ξ_i^k) = v_i^k. Since R_k is interpolatory, the compatibility condition

D_k R_k = I_k    (2)

is fulfilled, where I_k : R^{n_k+1} → R^{n_k+1} is the identity operator.

Note 1. An example of these concepts is the following. On the one hand, let ξ_i^k = i 2^{-k}, n_k = 2^k. Then

(D_k f)_i = f(i 2^{-k}),  0 ≤ i ≤ n_k.

On the other hand, we consider the reconstruction operators R_k v^k = I_1(·; ξ^k, v^k), where I_1 is the 1st degree polynomial interpolation. Actually, R_k v^k is the unique polygonal satisfying (R_k v^k)(ξ_i^k) = v_i^k.

D_k f is the representation of f at the resolution level k, and will be denoted by v^k. In Lemma 3.1 of [12] it is proved that the operator that sends v^{k+1} = D_{k+1} f to v^k = D_k f is well-defined and linear. It is called the decimation operator and it is written D_{k+1}^k. Moreover, it can always be expressed as

D_{k+1}^k = D_k R_{k+1},    (3)

although R_{k+1} could be non-linear. In particular, for the point-value discretization D_k, the decimation operator is

(D_{k+1}^k v^{k+1})_i = v_{2i}^{k+1},  0 ≤ i ≤ n_k.

Inversely, the prediction operator P_k^{k+1} gives an approximation of D_{k+1} f from D_k f:

P_k^{k+1} = D_{k+1} R_k.    (4)

Notice the similarity with (3). The error of P_k^{k+1} is measured as

δ^{k+1} = v^{k+1} − P_k^{k+1} v^k.

By (4), (3) and (2), in this order,

δ_{2i}^{k+1} = v_{2i}^{k+1} − (D_{k+1} R_k v^k)_{2i} = v_i^k − (D_k R_k v^k)_i = v_i^k − v_i^k = 0.

So the even values of δ^{k+1} are 0. Hence, we will pay attention exclusively to the odd values of δ^{k+1}, which are named the detail coefficients:

d_i^k = δ_{2i+1}^{k+1} = v_{2i+1}^{k+1} − (P_k^{k+1} v^k)_{2i+1}.    (5)

Let a ∈ R^α and b ∈ R^β, and let us denote by (a; b) ∈ R^{α+β} the concatenation of a and b. Since v^k ∈ R^{n_k+1} = R^{n_{k+1}/2+1} and d^k ∈ R^{n_{k+1}/2}, then (v^k; d^k) ∈ R^{n_{k+1}+1}. In fact, there exists a bijection between (v^k; d^k) and v^{k+1} ∈ R^{n_{k+1}+1}:

v^{k+1} ←→ (v^k; d^k).    (6)

Abusing notation, we denote by (a_1; a_2; a_3; ...) the concatenation of several vectors. Note that (6) can be applied recursively to obtain the following equivalences:

v^k ←→ (v^{k-1}; d^{k-1}) ←→ ... ←→ (v^0; d^0; d^1; ...; d^{k-1}).

The multiresolution decomposition of k levels, M_k (or multiresolution transform), is defined as

M_k v^k = (v^0; d^0; d^1; ...; d^{k-1}).

We denote its inverse, the inverse multiresolution transform, by M_k^{-1}.
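As an illustration, the interpolatory multiresolution transform of Note 1 (point-value discretization with the polygonal reconstruction) can be sketched in a few lines of Python; the function names are ours:

```python
import numpy as np

def decimate(v):
    """D^k_{k+1}: keep the even-indexed values (point-value framework)."""
    return v[::2]

def predict(v):
    """P^{k+1}_k with the polygonal rule: keep v, insert midpoint averages."""
    out = np.empty(2 * len(v) - 1)
    out[::2] = v
    out[1::2] = 0.5 * (v[:-1] + v[1:])
    return out

def mr_forward(vK):
    """M_K: v^K -> (v^0; d^0; ...; d^{K-1}); details are odd prediction errors."""
    details = []
    v = vK
    while len(v) > 2:                       # down to level 0 (two points, n_0 = 1)
        coarse = decimate(v)
        details.append(v[1::2] - predict(coarse)[1::2])
        v = coarse
    return v, details[::-1]                 # (v^0, [d^0, ..., d^{K-1}])

def mr_inverse(v0, details):
    """M_K^{-1}: rebuild v^K by predicting and adding the details back."""
    v = v0
    for d in details:
        v = predict(v)
        v[1::2] += d
    return v
```

The forward and inverse transforms are exact inverses of each other, since the odd values are recovered as prediction plus stored detail.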

Interpolatory prediction operators
A classical approach to designing prediction operators consists in using piecewise polynomial interpolation techniques, I(ξ; ξ^k, v^k). Given some values v^k on a grid ξ^k, they provide a function of ξ satisfying

I(ξ_i^k; ξ^k, v^k) = v_i^k,  0 ≤ i ≤ n_k.

The corresponding prediction operators are obtained by evaluating I(·; ξ^k, v^k) on the finer grid ξ^{k+1}:

(P_k^{k+1} v^k)_i = I(ξ_i^{k+1}; ξ^k, v^k).

For equally-spaced grids, that is ξ_{i+1}^k − ξ_i^k = h_k, the reconstruction technique can usually be expressed in terms of a local rule I, as follows:

(P_k^{k+1} v^k)_{2i} = v_i^k,  (P_k^{k+1} v^k)_{2i+1} = I(v_{i-l+1}^k, ..., v_{i+r}^k),

for some l, r > 0. The accuracy of the interpolation technique is an important aspect to take into account. I has order of approximation s if

‖f − I(·; ξ^k, D_k f)‖_∞ = O(h_k^s)    (7)

for any f sufficiently smooth; in terms of the local rule, |f(ξ_{2i+1}^{k+1}) − I(f(ξ_{i-l+1}^k), ..., f(ξ_{i+r}^k))| = O(h_k^s).

Note 2. Some examples of local rules, written for the predicted midpoint value between v_i^k and v_{i+1}^k, are:

• Polygonal rule, or 1st degree polynomial rule: (v_i^k + v_{i+1}^k)/2.

• Cubic rule, or 3rd degree polynomial rule: (−v_{i-1}^k + 9 v_i^k + 9 v_{i+1}^k − v_{i+2}^k)/16.

• PCHIP rule: (v_i^k + v_{i+1}^k)/2 + h_k (m_i − m_{i+1})/8, the midpoint value of the cubic Hermite interpolant with slopes m_i = H(Δ_{i-1/2}, Δ_{i+1/2}), Δ_{i+1/2} = (v_{i+1}^k − v_i^k)/h_k, where H(x, y) = 2xy/(x+y) if x, y have the same sign and H(x, y) = 0 otherwise.

Notice that the polygonal and cubic rules are linear, while the PCHIP rule is non-linear. It can be proven that the polygonal rule has approximation order 2, the cubic rule has order 4, and the PCHIP rule has order 4 if f is strictly monotone but order 2 in general.
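The three local rules of Note 2 can be written down directly. The following sketch (our own naming; the PCHIP midpoint formula is the harmonic-mean cubic Hermite rule as stated above) predicts the midpoint value between v_i^k and v_{i+1}^k:

```python
import numpy as np

def polygonal_mid(v, i):
    """1st degree rule, order 2."""
    return 0.5 * (v[i] + v[i + 1])

def cubic_mid(v, i):
    """3rd degree rule on the centered 4-point stencil, order 4."""
    return (-v[i - 1] + 9 * v[i] + 9 * v[i + 1] - v[i + 2]) / 16.0

def H(x, y):
    """Harmonic mean, set to 0 when the slopes change sign."""
    return 2 * x * y / (x + y) if x * y > 0 else 0.0

def pchip_mid(v, i, h):
    """Cubic Hermite midpoint value with PCHIP (harmonic-mean) slopes."""
    s = np.diff(v) / h                      # one-sided slopes Delta_{i+1/2}
    m_i = H(s[i - 1], s[i])
    m_ip1 = H(s[i], s[i + 1])
    return 0.5 * (v[i] + v[i + 1]) + h * (m_i - m_ip1) / 8.0
```

On smooth monotone data the cubic rule reproduces cubic polynomials exactly and PCHIP comes close, while the polygonal rule only achieves second-order accuracy.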
Prediction operators in Harten's MRF define subdivision schemes. These are iterative processes where denser and denser sets of data are generated by applying the local rules recursively. A subdivision scheme is convergent if this process converges to a continuous function. See for instance [4,9,10].
In [5] the subdivision scheme defined from the PCHIP rule was studied. Its authors found some advantages with respect to other subdivision schemes, such as the ENO scheme [2]. They proved that PCHIP is monotonicity preserving, which means that it converges to a monotone function if monotone data is used. It is also Lipschitz stable (see [5] for specific definitions), which makes it more suitable as a prediction operator in Harten's MRF than the non-linear Essentially Non-Oscillatory (ENO) reconstruction techniques used in [1,11]. The monotonicity preserving property is particularly interesting when data coming from discontinuous functions need to be handled, because it avoids oscillatory behavior in the reconstructed data around jumps and steep gradients (see [5]), just as the ENO reconstructions do.
If f is sufficiently smooth, by (7),

|d_i^k| = |f(ξ_{2i+1}^{k+1}) − (P_k^{k+1} v^k)_{2i+1}| = O(h_k^s).

We are assuming that (ξ^k)_{k≥0} are nested and equally spaced, thus

|d_i^k| = O(2^{-ks}).

If f has a discontinuity in its s_0-th derivative, 0 ≤ s_0 ≤ s, then the detail coefficients whose stencil crosses the singularity only satisfy |d_i^k| = O(2^{-k s_0}).
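These decay rates are easy to check numerically with the polygonal rule (s = 2): for a smooth f the largest detail should shrink by a factor of about 2^s = 4 per level, while the detail straddling a jump (s_0 = 0) should not decay at all. A small sketch, with our own helper name:

```python
import numpy as np

def max_detail(f, k):
    """Largest polygonal-rule detail coefficient d^k of f on [0, 1]."""
    fine = f(np.linspace(0.0, 1.0, 2 ** (k + 1) + 1))   # level k+1 point values
    coarse = fine[::2]                                  # level k point values
    pred = 0.5 * (coarse[:-1] + coarse[1:])             # predicted odd values
    return np.abs(fine[1::2] - pred).max()

smooth = lambda x: np.sin(2 * np.pi * x)
jump = lambda x: np.where(x < 0.55, 0.0, 1.0)

# smooth f: details shrink like 2^{-2k}, so consecutive levels differ by ~4
r_smooth = max_detail(smooth, 5) / max_detail(smooth, 6)
# discontinuous f: the detail next to the jump stays O(1)
r_jump = max_detail(jump, 5) / max_detail(jump, 6)
```

This O(1) detail near the jump is precisely what lets the TE algorithm localize the singularity and refine only there.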

The Truncation and Encode approach
In this section, we describe the TE method restricted to the approximation of functions. Given a target error ε > 0, a piecewise smooth function f : [0,1] → R and a suitable finest resolution level K, the goal is to obtain some ṽ^K satisfying

‖v^K − ṽ^K‖_∞ ≤ ε    (8)

using as few evaluations of f as we can. The algorithm computes ṽ^k recursively, from the coarsest level k = 0 to the finest one k = K. The construction of (ṽ^k)_{k=0}^K begins by evaluating f at ξ^0 and ξ^1: ṽ^0 = D_0 f and ṽ^1 = D_1 f. The details are computed using (5). We compute ṽ^{k+1}, k ≥ 1, iteratively: for each 0 ≤ i ≤ n_k − 1, and for each 1 ≤ k ≤ K − 1 (K pre-fixed), we set

ṽ_{2i+1}^{k+1} = f(ξ_{2i+1}^{k+1}) if |d_{⌊i/2⌋}^{k-1}| ≥ ε_k,  ṽ_{2i+1}^{k+1} = (P_k^{k+1} ṽ^k)_{2i+1} otherwise;  ṽ_{2i}^{k+1} = ṽ_i^k.    (9)

Figure 1 shows a sketch of recurrence (9). It has a simple interpretation: on one hand, if |d_j^{k-1}| < ε_k, then a 'small' prediction error indicates that P_{k-1}^k approximates the function well at ξ_{2j+1}^k; thus P_k^{k+1} may also provide a good approximation at ξ_{2i+1}^{k+1}, i ∈ {2j, 2j+1}, and we do not need to evaluate f at these points. On the other hand, |d_j^{k-1}| ≥ ε_k means that P_{k-1}^k did not provide a sufficiently 'good' approximation to the function values, hence we prefer to evaluate f to obtain ṽ_{2i+1}^{k+1} as the exact value of v_{2i+1}^{k+1}. It should be noted that in this case both ṽ_{4j+1}^{k+1} (i = 2j) and ṽ_{4j+3}^{k+1} (i = 2j+1) must be computed, so f is evaluated twice. At the end of the recurrence, we obtain the vectors ṽ^K and (d^k)_{k=0}^{K-1}. Moreover, we know the MR decomposition of ṽ^K because of (10):

M_K ṽ^K = (ṽ^0; d^0; ...; d^{K-1}).

In [1] the authors arrive at the conclusion that (8) is fulfilled if ε_k = 2^{K-k} ε. However, we notice in the numerical experiments of [1,11] that this thresholding strategy leads to

‖v^K − ṽ^K‖_∞ ≪ ε.

Thus, the real precision of ṽ^K depends on both ε and K. Moreover, if a large K is needed, because a very fine mesh is demanded, then it turns out that ‖v^K − ṽ^K‖_∞ is unnecessarily small and the technique loses efficiency. We will see that the thresholding criterion ε_k = 2^{K-k} ε proposed in [1,11] can be relaxed to ε_k = ε, ensuring (8) with a precision that stays close to ε.
Our strategy reduces the number of evaluations. In addition, the independence from K allows us to take this value as large as necessary to ensure the desired accuracy in the numerical integration of f. Another advantage of using ε_k = ε is that it is not mandatory to fix the finest level K in advance, so we can select another criterion to stop the recurrence, for example, that a certain number of evaluations has been reached. By contrast, taking ε_k = 2^{K-k} ε demands knowing K before the execution starts.
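The recurrence (9) with the flat threshold ε_k = ε can be sketched compactly. The following is our own simplified implementation (polygonal prediction, n_0 = 1), not the code of [1,11]; it returns the approximation ṽ^K together with the number of function evaluations used:

```python
import numpy as np

def te_approx(f, K, eps):
    """Truncate-and-Encode sketch: flat threshold eps_k = eps, polygonal rule."""
    n_eval = 0

    def ev(x):                                   # evaluate f and count the calls
        nonlocal n_eval
        n_eval += len(x)
        return f(x)

    v = ev(np.linspace(0.0, 1.0, 3))             # levels 0 and 1, fully evaluated
    d_prev = np.array([v[1] - 0.5 * (v[0] + v[2])])   # detail d^0
    for k in range(1, K):                        # build level k+1 from level k
        xi_fine = np.linspace(0.0, 1.0, 2 ** (k + 1) + 1)
        pred = np.empty(2 ** (k + 1) + 1)
        pred[::2] = v                            # even values are kept
        pred[1::2] = 0.5 * (v[:-1] + v[1:])      # polygonal prediction of odd ones
        fine = pred.copy()
        for i in range(2 ** k):                  # odd value at fine index 2i+1
            if abs(d_prev[i // 2]) >= eps:       # parent detail was 'large'
                fine[2 * i + 1] = ev(xi_fine[2 * i + 1 : 2 * i + 2])[0]
        d_prev = fine[1::2] - pred[1::2]         # details of the new level
        v = fine
    return v, n_eval
```

For a smooth function the refinement stops once the details fall below ε, so the number of evaluations stabilizes while K keeps growing, which is the behavior reported in the tables below.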

Numerical experiments
The purpose of the present section is to examine the efficiency of the TE algorithm under the various strategies considered in the previous sections. In particular, we show that the threshold strategy ε k = ε, k ≥ 0, leads to the desired target accuracy in a much more efficient manner than ε k = 2 K−k ε, as proposed in [1,11].
In this paper, efficiency is a measure of the number of evaluations required to reach a target error in the computation of ṽ^K. We show in this section that the efficiency of the TE strategy relies on the interpolation technique used in the prediction operator. In addition, we show numerically that the strategy ε_k = ε ensures

2^{-s} ε ≲ ‖v^K − ṽ^K‖_∞ ≤ ε,

where s is the order of approximation of the local rule, but also that the number of evaluations increases slowly with respect to the precision:

n_eval = O(‖v^K − ṽ^K‖_∞^{-1/s}).

Hence, highly accurate interpolation techniques improve the efficiency.
On the other hand, we will see that non-linear interpolation techniques do not provide, in general, any advantage with respect to linear ones. Thus, linear prediction operators are preferable, because they are faster to compute.
The TE authors consider the following efficiency measures. The first one is the compression ratio

μ_cr = (n_K + 1)/(n_w + n_0 + 1),

where n_w = #{|d_i^k| ≥ ε} is the number of detail coefficients greater than or equal to ε, and n_K + 1 = 2^K + 1 is the number of function evaluations needed to compute v^K in full. The second one is

τ = (n_K + 1)/n_eval,

where n_eval is the total number of function evaluations used by TE to obtain ṽ^K. Thus τ = 1 if the function was evaluated everywhere in ξ^K, and the larger τ is, the more values ṽ_i^k were interpolated instead of evaluated. In the TE method, each |d_i^{k-1}| ≥ ε implies two new function evaluations at the resolution level k + 1, so n_eval can be written as

n_eval = 2 n_w + n_0.    (11)

Ê_K is obtained by applying the trapezoidal rule to ṽ^K. Using (11), τ can be expressed as

τ = (2^K + 1)/(2 n_w + n_0),

as shown in Tables 1 and 2 of [1,11]. In this paper, we will only consider τ.
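These efficiency measures are elementary to compute. A small sketch (the helper name is ours) using the relation (11); for instance, n_w = 6 significant details with n_0 = 1 yields n_eval = 13, the evaluation count reported below for the cubic and PCHIP rules:

```python
def efficiency_measures(K, n_w, n_0=1):
    """Compression ratio mu_cr and evaluation ratio tau for the TE method.

    K   : finest resolution level, so n_K + 1 = 2**K + 1 grid values,
    n_w : number of detail coefficients with |d| >= eps,
    n_0 : number of intervals at the coarsest level (n_0 = 1 in the paper).
    """
    n_K = 2 ** K
    n_eval = 2 * n_w + n_0            # relation (11): two evaluations per large detail
    mu_cr = (n_K + 1) / (n_w + n_0 + 1)
    tau = (n_K + 1) / n_eval
    return mu_cr, tau, n_eval
```

Note that for fixed K, both measures grow as the number of significant details n_w shrinks, i.e. as more values are interpolated instead of evaluated.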
In Tables 2 to 7, we carry out the same experiment as in section 5.1 of [11]. We consider the parameters of Table 1 and the functions f_1 (smooth) and f_2 (with a jump at ξ = 11/20), which are shown in Figure 2. In Tables 2 to 4 we apply the TE strategy with ε_k = 10^{-1}, k ≥ 0, to the smooth function f_1 of Figure 2 and display the parameters of Table 1. We use the prediction operators P_k^{k+1} described in Remark 2, i.e. the polygonal, cubic and PCHIP rules, whereas in [11] the polygonal, cubic and ENO rules were used. As mentioned previously, we consider the PCHIP prediction more convenient because of its stability. As in the original papers, we set n_0 = 1, so ξ^0 = {0, 1}. From these experiments we extract the following conclusions.

Table 2: Parameters as specified in Table 1 for f_1, ε = ε_k = 10^{-1} and the polygonal rule.
Table 3: Parameters as specified in Table 1 for f_1, ε = ε_k = 10^{-1} and the cubic rule.
As observed in Table 2, the polygonal rule uses 13 evaluations for K = 3, but 17 for K ≥ 4. The cubic and PCHIP rules use only 13 evaluations for any K ≥ 3, as shown in Tables 3 and 4. From this fact, we deduce that there is a level k_0 such that for all k ≥ k_0 the TE algorithm does not evaluate f_1 anymore. Actually, this property may hold for a general smooth function and a given threshold ε > 0.
Notice also that e_K = ‖v^K − ṽ^K‖_∞ is always smaller than ε, but it stays close to ε, a clear advantage of using ε_k = ε. Indeed, it can be observed in Tables 1 and 2 of [1,11] that e_K tends to zero as K increases when the strategy ε_k = 2^{K-k} ε is selected. Since our target accuracy is e_K ≤ ε with as few evaluations as possible, if the TE strategy uses just the necessary evaluations, then e_K should be close to ε. Thus, the fact that e_K tends to zero with the criterion ε_k = 2^{K-k} ε means that the function is being evaluated more than is actually needed.
As exposed in Section 1, the following condition is fulfilled:

|E − Ê_K| ≤ |E − E_K| + ‖v^K − ṽ^K‖_∞ ≤ |E − E_K| + ε.    (13)

Since f_1 is smooth, |E − E_K| = O(2^{-2K}). Thus, for large values of K, (13) becomes

|E − Ê_K| ≲ ε,    (14)

which can also be checked in Tables 2 to 4 for any K > 3.
The smallest e_K is achieved with the cubic rule, and the largest with the polygonal rule. Between them is the PCHIP rule, with smaller e_K and fewer evaluations than the polygonal rule, but larger e_K with the same number of evaluations than the cubic rule. Hence the cubic rule seems to be better than the PCHIP rule for smooth functions, which is probably a consequence of the better approximation properties of the cubic rule in all smooth regions, while the PCHIP formula is as accurate as the cubic rule only in monotone regions.

Table 4: Parameters as specified in Table 1 for f_1, ε = ε_k = 10^{-1} and the PCHIP rule.
Table 5: Parameters as specified in Table 1 for f_2, ε = ε_k = 10^{-1} and the polygonal rule.
Table 6: Parameters as specified in Table 1 for f_2, ε = ε_k = 10^{-1} and the cubic rule.
In Tables 5 to 7 we repeat the same experiment for the function f_2, which has a jump discontinuity at ξ = 11/20. Once again, we use ε_k = ε = 10^{-1}, and we explore the effect of increasing K.
For K > 4 in Table 5 and K > 3 in Tables 6 and 7, the TE algorithm evaluates f_2 only two more times at each new level. This happens because there is only one detail coefficient satisfying |d_j^{k-1}| ≥ ε, which implies two new evaluations at level k + 1. We refer to Figure 1 for the sake of clarity. This d_j^{k-1} is located around the discontinuity, as is easily seen in Fig. 5 of [11].
In spite of the discontinuity of f_2, ‖v^K − ṽ^K‖_∞ is smaller than ε, yet stays near ε, which supports the criterion ε_k = ε. Now |E − E_K| = O(2^{-K}), thus (14) is satisfied for K > 5.
Note that the polygonal rule uses more evaluations of f_2, but its e_K is also smaller than that of the other rules. e_K has similar magnitudes for the PCHIP and cubic rules, and both use the same number of evaluations.
Once again, e_K converges to some value close to ε as K → ∞, in contrast to what is observed in Tables 1 and 2 of [1,11], where e_K → 0. This confirms that the thresholding strategy ε_k = ε, instead of ε_k = 2^{K-k} ε, is the most appropriate to attain a predetermined target accuracy in the computation.
In Tables 8 to 13 we repeat the same experiment for a fixed value of K, K = 15, and several values of ε. We do not report τ in Tables 8 to 13, because for fixed K we have τ = O(n_eval^{-1}). In the following, we first study the approximation of the smooth function f_1, which is shown in Tables 8 to 10.
The first feature we observe is that the error control 2^{-s} ε ≲ e_K ≤ ε is satisfied, which provides strong control on the error e_K = ‖v^K − ṽ^K‖_∞. We also notice that the growth rate of the number of evaluations, n_eval, is larger for the polygonal rule. This is easily seen by plotting the results of the tables, shown in Figure 3: the observed rates are n_eval = O(e_K^{-1/2}) for the polygonal rule and n_eval = O(e_K^{-1/4}) for the cubic and PCHIP rules. PCHIP attains the fourth-order rate because, in the TE algorithm, the rule is applied in small monotone regions along the whole interval [0,1], where PCHIP has order 4. In addition, we clearly observe that cubic is a bit better than PCHIP, and both are much better than polygonal.
Another issue we can observe in Tables 8 to 10 is that |E − Ê_K| decreases together with ε for ε ≥ 10^{-8}, but for smaller ε the precision of the integral gets stuck around 10^{-11}. This can be explained using the triangle inequality:

|E − Ê_K| ≥ |E − E_K| − |E_K − Ê_K| ≥ |E − E_K| − ε.    (16)

For the trapezoidal rule with K = 15, |E − E_K| = (h_K^2/12) |f''(η)| = (2^{-30}/12) |f''(η)| ≈ 7.8e-11 |f''(η)| for some η ∈ (0,1). This says that, if ε → 0, then |E − Ê_K| is lower bounded by 7.8e-11 |f''(η)|, which should be close to the number 1.5e-11 shown in Tables 8 to 10. Comparing Figure 4 in this paper with Fig. 4 of [11], we note that our plot displays fewer oscillations. This may indicate that the dependency between e_K and the number of evaluations is stronger and smoother when the criterion ε_k = ε is selected.
In Tables 11 to 13 we repeat the experiment for the function f_2, which has a jump discontinuity at ξ = 11/20. Notice that the error control e_K ≤ ε still holds and that the growth of n_eval is again faster for the polygonal rule. Observing Figures 5 and 6, corresponding to these tables, we find the growth rates n_eval = O(e_K^{-1/2}) for the polygonal rule and n_eval = O(e_K^{-1/4}) for the cubic and PCHIP rules. As shown in Figure 5, the cubic rule is better than the PCHIP rule at approximating v^K. However, the PCHIP rule is now better than the cubic rule at computing the integral, at least for ε > 10^{-4}, before |E − Ê_K| gets stuck.
A similar argument to (16) can be used to explain why |E − Ê_K| → 1.1e-06 in Tables 11 to 13. Since f_2 is not continuous, the error of the trapezoidal rule only decreases as O(2^{-K}); for K = 15 this residual quadrature error, which does not vanish as ε → 0, is what lower-bounds |E − Ê_K|.

Table 12: Parameters as specified in Table 1 for f_2, K = 15 and the cubic rule. ε_k = ε varies between 10^{-1} and 10^{-10}.
Table 13: Parameters as specified in Table 1 for f_2, K = 15 and the PCHIP rule. ε_k = ε varies between 10^{-1} and 10^{-10}.
Again, a comparison between our Figure 4 and Fig. 6 of [11] seems to indicate that the strategy ε k = ε makes the TE approach more robust than ε k = 2 K−k ε.
We conclude that PCHIP only provides better performance than the usual cubic interpolation when the integral of a discontinuous function is computed. However, the cubic rule was better in all the other cases: integration of continuous functions and approximation of continuous or even discontinuous functions. The TE authors also arrived at the conclusion that no advantage was seen using the ENO interpolation for f_1, but ENO was better for f_2 in terms of accuracy and number of evaluations.
The results included in the present paper indicate that non-linear interpolatory reconstructions may not lead to substantial advantages over linear ones within the TE framework. We have observed that, in general, cubic interpolation leads to better results than PCHIP interpolation, while both always fulfill n_eval = O(e_K^{-1/4}). On the other hand, the usual reason for using ENO or PCHIP interpolatory techniques is their capability of approximating jumps in the data without spurious oscillations. But in the TE strategy this is not needed, because the algorithm itself has a mechanism to detect the problematic zones of the function. In addition, linear interpolations are always faster than non-linear ones.
We wish to remark again that the choice ε_k = ε confers a high degree of control over the error. In contrast, the original selection ε_k = 2^{K-k} ε made by the TE authors only guarantees that ε is an upper bound of the error. Observe that in [1,11] all the numerical experiments were done with ε = 10^{-1}, but the error converges to zero as K → ∞. In our case, the error converges to some number below ε, but greater than 2^{-s} ε, where s is the order of the interpolation technique. This is a practical advantage, because K determines the resolution of the approximation ṽ^K, and we can set it as large as we need, while the number of evaluations and the error depend exclusively on ε.

Conclusions
In this paper we have examined some features of the Truncation and Encode strategy described in [1,11], in particular its ability to compute integrals by quadrature rules to a given target accuracy.  In the TE strategy, the values used in the quadrature rule are computed by combining function evaluations and suitably interpolated values in a multilevel fashion, following the basic guidelines of Harten's MRF.
Our theoretical observations and numerical experiments allow us to conclude that the TE technique becomes more efficient if the thresholding strategy proposed in [1,11] is relaxed to ε_k = ε for all resolution levels. In addition, higher order prediction schemes lead to improved performance, but it does not seem necessary to resort to non-linear prediction schemes.
More research should be done to support these conclusions at a theoretical level.